Audit and Assurance


Development of a Disaster Recovery Plan (DRP)

  • 1.  Development of a Disaster Recovery Plan (DRP)

    Posted 03 Feb, 2020 09:04
    Hi everyone,

    Following an IT audit observation on the lack of a DRP, I wanted to know how one goes about developing the DRP, especially given that elements such as

    - Business Impact Analysis,
    - RPO & RTO
    - Test plans and schedules
    - etc.
    are required as part of the DRP.

    Finally, what does the final deliverable look like? Is it a document with the above measures and outcomes documented?

    Please advise, as the auditors keep following up with the auditee on the lack of a DRP when the auditee does not necessarily know how to go about it.

    ------------------------------
    Bader Abuhilal
    Information Systems Auditor
    ------------------------------


  • 2.  RE: Development of a Disaster Recovery Plan (DRP)

    Posted 04 Feb, 2020 09:14
    Hi Bader.  Is there a business continuity plan?  I usually would recommend the creation of a business continuity plan with a corresponding disaster recovery plan.  IT would work with business operations to develop these plans based on the results of a BIA.

    It is difficult to answer the rest of your questions without knowing your business/industry/scope.  For example, are you isolating this to BC/DR planning for one specific critical business process (e.g., an ERP application) or for the entire corporation?  For the latter, I've seen dedicated BC/DR staff who create and update plans, provide training and awareness, and run periodic testing.

    Lastly, it is not Audit's responsibility to do management's work.  IT Audit has identified an observation that requires action to mitigate the risk.  Your recommendation could be for the Client to convene a BC/DR steering committee, with representatives from business operations and IT operations, to determine an appropriate strategy for addressing the BC/DR planning gap noted by IT Audit.

    ------------------------------
    Dominic Pasqualino
    Director, ISACA Philadelphia Chapter
    ------------------------------



  • 3.  RE: Development of a Disaster Recovery Plan (DRP)

    Posted 05 Feb, 2020 06:08
    Hi Dominic,
    Thanks for your reply.
    The observation recommends establishing an enterprise-wide BCM process. It then goes on to recommend that, within the BCM, an IT business continuity plan be established with a Business Impact Analysis, RTO and RPO, test plan and schedule, etc.

    My understanding from your point is that this should be a joint effort between business operations and IT to develop the BCP and DRP. However, after my discussion with IT, and given their desire to close this point, IT is willing to do it alone and cover the DRP part that they can control. So they asked me for guidance.
    So I guess my question now is:
    1) Can IT do it alone and develop their DRP in isolation from the rest of the BCP?
    2) Can IT Audit provide guidance?

    Sorry if this is confusing but I would really like your opinion about this.

    Thanks

    ------------------------------
    Bader Abuhilal
    Information Systems Auditor
    ------------------------------



  • 4.  RE: Development of a Disaster Recovery Plan (DRP)

    Posted 05 Feb, 2020 07:04
    I'm not sure if I can be of help here, but:

    - In order to come up with a BC/DR strategy, the company must first decide what the critical business processes are (as well as their dependencies on other "supporting" services).
    - Next, they need to determine how long they could sustain an outage (RTO), and to what extent they can lose data (RPO).
    - This, together with many other things, is combined into a BIA that documents the impact of an outage. Dependencies, stakeholders, etc. should also be mentioned.
    - Next, we can start to think about the BCP: how can we ensure these processes keep running?
    - A part of this BCP is DR: what must be done, and in what order, to get these processes back up and running (based on the RTO/RPO that was determined earlier).
    - In terms of testing, the timeframes can be mandated by some standard/regulation, or you could enforce a yearly/quarterly/monthly test.

    It is advisable, though, that when major changes happen, the BCP/DR is also reviewed together with the BIA. (If not, your BCP/DR may become a risk too.)

    In terms of testing, there are many ways to do tests, ranging from meetings where scenarios are evaluated, to a controlled failover, to automated failovers, to rebuilding an entire datacenter, including data restoration.

    In terms of guidance, a Steering Committee that includes the major stakeholders may be an option, as they should be able to determine the general directions (policies). These can then be worked out by Work Groups that include the stakeholders for specific processes.
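
    To make the BIA steps above a little more concrete, here is a minimal sketch of what one row of a BIA register could look like and how RTO/RPO can drive the recovery priority. The class name, field names, processes and all figures are made up for illustration; the real values have to come from the business.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class BiaEntry:
    """One row of a hypothetical BIA register."""
    process: str                      # critical business process
    rto: timedelta                    # how long an outage can be sustained
    rpo: timedelta                    # how much data loss can be tolerated
    depends_on: tuple[str, ...] = ()  # supporting services the process needs


def recovery_priority(entries: list[BiaEntry]) -> list[BiaEntry]:
    """Order processes so the tightest RTO (then the tightest RPO) comes first."""
    return sorted(entries, key=lambda e: (e.rto, e.rpo))


# Illustrative values only (assumptions, not taken from any real BIA).
bia = [
    BiaEntry("Payroll", timedelta(days=3), timedelta(hours=24), ("ERP database",)),
    BiaEntry("Online sales", timedelta(hours=4), timedelta(minutes=15),
             ("ERP database", "Web servers")),
    BiaEntry("Supplier ordering", timedelta(days=1), timedelta(hours=4),
             ("ERP database", "Supplier API")),
]

for rank, entry in enumerate(recovery_priority(bia), start=1):
    print(f"PRIO {rank}: {entry.process} (RTO {entry.rto}, RPO {entry.rpo})")
```

    Sorting by RTO alone is a simplification; in practice the steering committee may override the order for regulatory or contractual reasons.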

    ------------------------------
    Sven De Preter

    Sr. Network & Systems Administrator
    Corporate DPO Team Member

    Certs:
    - CompTIA CSCP (Stackable)
    - CompTIA CCAP (Stackable)
    - CompTIA Cloud+ ce
    - CompTIA Security+ ce
    - CompTIA Network+ ce

    Feel free to connect with me on LinkedIn: https://www.linkedin.com/in/svendepreter/
    ------------------------------



  • 5.  RE: Development of a Disaster Recovery Plan (DRP)

    Posted 05 Feb, 2020 15:23
    Hi Bader.  Here are answers to your questions:

    1) I would advise against IT developing the DRP in isolation.  The purpose of BC/DR planning is to resume and recover business operations in the event of an emergency or disruption to the business.  IT is a service provider to the business; without knowing and verifying what the business requirements are and what is most important, it would be impossible for IT to develop an effective DRP.

    For example, IT may not include third-party applications (e.g., SaaS solutions) that they do not support, under the assumption that DR would be the vendor's responsibility.  However, has IT reviewed those contracts to see what the terms and conditions say regarding DR?  I will guarantee that some responsibilities will fall back on the Company, and IT would need to address that in their DRP.

    2) Yes, IT Audit can provide guidance.  My recommendation and Sven's post both pointed to creating a BC/DR planning Steering Committee.  IT Audit can be a valuable member of the steering committee in an advisory role.  It really should be one overall plan that covers BC needs and how DR supports these needs based on the results of the BIA and RTO/RPO.

    I hope you find this helpful, please let me know if you have any further questions.  Thanks.

    ------------------------------
    Dominic Pasqualino
    Director, ISACA Philadelphia Chapter
    ------------------------------



  • 6.  RE: Development of a Disaster Recovery Plan (DRP)

    Posted 06 Feb, 2020 05:38
    Hi Sven and Dominic,

    Thanks for your replies.

    It seems logical to identify your critical processes and systems as well as your RTOs and RPOs and all the components of your BIA before developing any recovery / continuity strategy.

    Actually implementing this is a totally different story, especially if your critical processes and functions all sit in one large ERP system serving several operating units and organizations. If you are going to create a steering committee, whom do you call? Whom do you call from IT? Should a third party be involved to act as the project manager?

    I thought I would dig deeper into this issue, hoping that you can shed more light. Thanks again.


    ------------------------------
    Bader Abuhilal
    Information Systems Auditor
    ------------------------------



  • 7.  RE: Development of a Disaster Recovery Plan (DRP)

    Posted 06 Feb, 2020 06:36
    Hi @Bader Abuhilal,

    One thing you can think about, which may set you off in the right direction:
    As everything is consolidated in one large ERP, it will probably be a modular system with modules for HR, Logistics, Production, IT, .....

    In order to create a BCP/DR plan, you can also look at it from an inverted point of view.
    Let's say that there is a disaster .... in what order should the system be rebuilt? What modules should be online and working first?
    Or, what would be needed to redeploy the entire system? Database server? Active Directory ? Web Servers? What are the specs on those?
    What data would be required? Where are the backups stored? How long does it take to fully restore the database(s)? How long does it take to reinstall software? (Is it custom code? Where can it be found? How recent is it?)

    In terms of personnel, ask who you would need at that time. Internal workers? Suppliers? Vendors? Partners? And what are their response times and responsibilities? (In effect, this defines your disaster recovery crew.) What if some of them are no longer with you, or have sought better places to work? Any fallbacks?

    In terms of finances, what do you need?
    In terms of hardware / facilities, what do you need?

    As an example, I could have a sales platform that involves many services and servers. We try to treat it as a whole. But.... in terms of priority...
    We want the sales and the front-end related stuff up and running as soon as possible. And we want it to run close to 100% uptime. (don't we all).
    So basically, we want clients to be able to do purchases.

    In terms of payments on this system, you could have a maintenance page stating that an email with payment details will be sent later, as the system is currently under maintenance. This also delays shipping of the goods, or ordering goods from your supplier.
    Is this the way to go? Maybe. It all depends on what the company wants.

    So, in terms of our platform, this brings us to:
    - DR PRIO 1 : Sales Module / Maintenance page on the Payments / Web Frontends / Shops
    ---> PostgreSQL database server (Ubuntu 27.00 / PostgreSQL 299 / 4 CPU / 32 GB RAM / 1 TB disk)
    ---> Sales backend webserver (Ubuntu 27.00 / Apache 26 / PHP 7.9 / .........)
    ---> Internet connectivity (min. 100Mbps, public ip requirements, public DNS)
    ---> Firewall setup with 3 DMZ (10.1.0.0/24, 10.2.0.0/24, 10.3.0.0/24)
    ---> ....
    - DR PRIO 2 : Payments & Shipping notifications
    - DR PRIO 3 : Ordering goods at suppliers / Shipping to customers

    Basically, these priorities are also our BCP priorities. Do note that it's more complex. Just giving this as an example.

    In terms of dependencies, and I'll make it simple:
    - PRIO 1 depends on database server, authentication server, web servers, API servers, internet connectivity
    - PRIO 2 depends on module configurations for electronic payment processing & accounting systems
    - PRIO 3 depends on API calls to external suppliers & warehousing.
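
    As a rough sketch, a dependency list like the one above can be turned into a rebuild order with a topological sort, here using Python's standard `graphlib` module (Python 3.9+). The node names are adapted from the example. The assumption that PRIO 2 needs PRIO 1 and PRIO 3 needs PRIO 2 is mine, added only to chain the priorities together.

```python
from graphlib import TopologicalSorter

# Each key can only be restored once everything in its set is running.
# The PRIO 1 -> PRIO 2 -> PRIO 3 chaining is an illustrative assumption.
dependencies = {
    "PRIO 1: sales / web frontends": {
        "database server", "authentication server",
        "web servers", "API servers", "internet connectivity",
    },
    "PRIO 2: payments & shipping notifications": {
        "PRIO 1: sales / web frontends",
        "payment processing configuration",
        "accounting systems",
    },
    "PRIO 3: supplier ordering / shipping": {
        "PRIO 2: payments & shipping notifications",
        "external supplier APIs",
        "warehousing",
    },
}


def restore_order(deps: dict[str, set[str]]) -> list[str]:
    """Return components in an order where every dependency comes first.

    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(deps).static_order())


for step, component in enumerate(restore_order(dependencies), start=1):
    print(f"{step:2d}. {component}")
```

    A circular dependency raises `graphlib.CycleError`, which is itself a useful finding: it means two components each assume the other is already up.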

    For the steering committee & C-levels, it all starts with the board of directors. What do they want to be up and running? Here a decision should be made on what needs to be up first, and what the order of the DR plan is.
    Next, workgroups can be created, and tasked for each phase of the plan.

    So for PRIO 1, we could have the following team as stakeholders (in this case, coordinators may be the better choice of wording)
    - Networking Team Lead
    - Database Leader
    - Security Leader
    - External Suppliers (DNS & Internet)
    - Facility Leader
    - Hosting Team (Server Setup)
    - Application Owner
    - Customer Service Rep (for the communication)

    In PRIO 2, this may shift to
    - Networking Team Leader
    - Security Leader
    - Payment Leader
    - API Development Leader
    - Hosting Team
    - Bank (for testing of the payment transactions)

    In PRIO 3, this may be another set.

    These people will all depend on each other, and may be involved in different restoration tasks, hence the definition of priority during the DR / BCP.

    I hope this makes sense. ;) At least, this is how I would do it. ;)

    ------------------------------
    Sven De Preter

    ------------------------------



  • 8.  RE: Development of a Disaster Recovery Plan (DRP)

    Posted 06 Feb, 2020 08:07
    Hey Sven,

    That's a lot of food for thought. Thanks a lot. I think I got the idea.

    ------------------------------
    Bader Abuhilal
    Information Systems Auditor
    ------------------------------



  • 9.  RE: Development of a Disaster Recovery Plan (DRP)

    Posted 06 Feb, 2020 09:51
    Hi Bader.  I would advise that a BC/DR Planning committee is sponsored by Senior Leadership and a steering committee is created with key stakeholders in mind.  To take from Sven's example, "As everything is consolidated in one large ERP, it will probably be a modular system with modules for HR, Logistics, Production, IT, ...." The steering committee should have a representative from each module and IT should have a representative that supports those modules.  If you have a project management office I would assign a project manager to coordinate everything and manage this project.

    Another idea to get buy-in for the need to develop integrated plans would be to host a tabletop exercise.  Usually you would perform this exercise with plans in place, to identify issues with your plans.  However, going through it with no plans will show that there is a need to address this major risk, and to do so as a team.  Here's a good resource that can help: https://www.fema.gov/emergency-planning-exercises.  Additionally, speak with your Risk Management department, which handles insurance policies for the Company.  Ask them to contact one of their insurance carriers and ask if they would be willing to come into your office and moderate/run a tabletop exercise for your company.

    ------------------------------
    Dominic Pasqualino
    Director, ISACA Philadelphia Chapter
    ------------------------------



  • 10.  RE: Development of a Disaster Recovery Plan (DRP)

    Posted 06 Feb, 2020 15:28
    Hi Bader,

    You got some excellent feedback.  You will need Senior Management and VP support to get this initiative rolling.  Please review the SLA (Service Level Agreement) for each system vendor.  If most applications are cloud based and data is backed up remotely, then your company's risks are reduced.  You may want to ask for the SOC reports of the third-party vendors to ensure they have business continuity and disaster recovery procedures in place.  If your company supports the system in house, then your DRP will be specific to the technology and supporting network infrastructure.  You need to meet with IT management and the IT infrastructure team.

    A DRP supports the Business Continuity Plan (BCP).  Here is a draft framework for the BCP:

    1. Business Impact Analysis and Risk Assessment

      Each business unit that is determined by the Company to provide essential functions shall conduct a Business Impact Analysis and Risk Assessment. The Business Impact Analysis will identify essential functions and workflow; determine the qualitative and quantitative impacts of a vulnerability/threat to essential functions, prioritize/establish recovery time objectives for the essential functions, and if appropriate, establish recovery point objectives for essential functions. The Risk Assessment will identify vulnerabilities and threats that may impact the business units' ability to fulfill the mission of the Company and define the controls in place to reduce the exposure to the vulnerabilities/threats.

      The Business Impact Analysis and Risk Assessment shall be approved/signed-off by the head of the business unit and the Business Continuity Coordinator or the Business Continuity Planning Committee, and retained.
    2. Business Continuity Plan

      Each business unit that is determined by the Company to provide essential functions shall develop a Business Continuity Plan that reflects sufficient forethought and detail to ensure a high probability of successful maintenance or restoration of essential functions following an unfavorable event. To assist in accomplishing this goal, the following elements from sample plans (create your own templates) will be of value in developing individual department plans. Such elements include, but are not limited to:
      1. Listing and prioritization of essential functions, including the identification of staffing and resource requirements, mission critical systems and equipment, and support activities for each essential function.
      2. Lines of Succession/Delegation of Authority for key Company positions, including guidance for the delegation of emergency authorities.
      3. Alternate Operating Facilities, including provisions to sustain operations for a period of up to thirty days (or other time frame as determined by the Company)
      4. Communications, including procedures and plans for communicating with internal personnel, other agencies, and emergency personnel.
      5. Protection and safeguarding of vital records and databases.
      6. Tests, Training, and Exercises to familiarize staff members with their roles and responsibilities during an emergency, ensure that systems and equipment are maintained in a constant state of readiness, and validate certain aspects of the Business Continuity Plan.


      The Business Continuity Plans shall be approved/signed-off by the head of the business unit and the Business Continuity Coordinator or the Business Continuity Planning Committee, and retained. The Company Contact shall perform an administrative review of the Business Continuity Plans at least annually or more frequently as needed. The "reviewed as of date" shall appear on the plans after each review.

    3. Testing and Exercising Plans

      Business units shall test some part of their Business Continuity Plan once a year, with all parts tested every seven years. An actual event necessitating activation of the Business Continuity Plan will meet this requirement. At the completion of each test or review, full documentation of test results and lessons learned shall be completed in the form of a Corrective Action Plan or After Action Report. Such reports shall be approved/signed-off by the head of the business unit and the Company Business Continuity Coordinator or the Business Continuity Planning Committee, and retained.
    4. Plan Maintenance

      Business units shall review their Business Continuity Plan and tests at least annually or more frequently as needed and update the plans whenever changes occur in their operating procedures, processes, or key personnel. Plans must be updated to maintain accurate lists of key personnel, telephone numbers, and plan elements that may be affected by changes in unit structure or functions. The updated Business Continuity Plans shall be approved/signed-off by the head of the business unit and the Business Continuity Coordinator or the Business Continuity Planning Committee and retained.
    5. Communication

      Ongoing communication of business continuity activities to the Company's Departments shall be provided in a variety of methods as determined by designated Company Contact.
    6. Training

      Initial training on conducting business continuity planning shall be provided to all individuals responsible for developing and implementing plans. Additional and/or repeat training shall be provided as determined necessary by the Business Continuity Coordinator or the Business Continuity Planning Committee following the review of written plans and plan testing.
    7. Record Retention

      The Company shall retain business continuity records, including those indicated in for a period of not less than X # of
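
    The testing rule in the framework above (some part of the plan tested every year, all parts tested within seven years) is easy to check against a simple test log. The sketch below assumes a hypothetical log that maps each plan part to the date it was last tested; the part names and dates are invented for illustration.

```python
from datetime import date
from typing import Optional


def years_ago(day: date, years: int) -> date:
    """Same calendar day `years` earlier (29 Feb falls back to 28 Feb)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


def overdue_parts(last_tested: dict[str, Optional[date]], today: date,
                  cycle_years: int = 7) -> list[str]:
    """Plan parts never tested, or last tested before the cycle cutoff."""
    cutoff = years_ago(today, cycle_years)
    return [part for part, tested in last_tested.items()
            if tested is None or tested < cutoff]


def annual_test_done(last_tested: dict[str, Optional[date]], today: date) -> bool:
    """True if at least one part was tested within the last year."""
    cutoff = years_ago(today, 1)
    return any(tested is not None and tested >= cutoff
               for tested in last_tested.values())


# Hypothetical test log (None means the part has never been tested).
last_tested = {
    "Essential functions list": date(2019, 11, 5),
    "Lines of succession": date(2015, 3, 10),
    "Alternate operating facilities": None,
    "Communications": date(2012, 6, 1),
    "Vital records": date(2018, 9, 20),
}
today = date(2020, 2, 10)

print("Annual test requirement met:", annual_test_done(last_tested, today))
print("Overdue parts:", overdue_parts(last_tested, today))
```

    An auditor could run the same check against the retained After Action Reports rather than a hand-maintained log.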

     



    ------------------------------
    Sal Rodriguez
    Director of Internal Audit
    CISA, CIA, CRMA, CCSA, CGAP, CICA, MBA, MS
    ------------------------------



  • 11.  RE: Development of a Disaster Recovery Plan (DRP)

    Posted 10 Feb, 2020 02:07
    Guys, this is a good discussion, and I appreciate your responses and insight.

    One final thought on this topic: if it is a centralized ERP system and backups are taken instantly or very close to instantly, then the RPO and RTO objectives are not relevant. And if IT is capable of resuming full business operations from the disaster recovery site (where the backup is taken), then what's the point of having a DRP?

    Should the audit function insist that a DRP be done? Or should audit accept this scenario and only ask to see evidence of testing and training? How detailed should the plan be?

    I hope I am clear. I look forward to your reply.

    ------------------------------
    Bader Abuhilal
    Information Systems Auditor
    ------------------------------



  • 12.  RE: Development of a Disaster Recovery Plan (DRP)

    Posted 10 Feb, 2020 02:13
    RTO and RPO are basically metrics that come into play when a disaster happens.

    Let's assume your datacenter burns down. How long would it take you to restore it?
    Are your backups consistent? Did you try restoring them to other hardware? If the data is inconsistent, your backup is useless ;)
    That's what DRP is about. It's all trashed in some way, and you need to get back into business asap.

    As for whether it should be done or not, that's a decision/risk that needs to be evaluated by senior management.
    Is it acceptable that you cannot restore your services? Is it acceptable that you may lose more data than defined by your RPO? Is it acceptable that the restore will take 2 weeks instead of 1 day?
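
    The questions above can be reduced to a simple comparison between the agreed objectives and what a restore test actually measured. Here is a minimal sketch; the function name and all figures are assumptions for illustration, apart from the "2 weeks instead of 1 day" case mentioned above.

```python
from datetime import timedelta


def check_objectives(rpo: timedelta, rto: timedelta,
                     backup_interval: timedelta,
                     measured_restore_time: timedelta) -> list[str]:
    """Compare restore-test measurements against the agreed RPO and RTO."""
    findings = []
    # Worst case: the disaster hits just before the next backup would run,
    # so everything written since the previous backup is lost.
    if backup_interval > rpo:
        findings.append(
            f"RPO missed: up to {backup_interval} of data could be lost "
            f"(objective {rpo})")
    if measured_restore_time > rto:
        findings.append(
            f"RTO missed: restore took {measured_restore_time} "
            f"(objective {rto})")
    return findings


# Near-instant backups, but the restore test took 2 weeks instead of 1 day.
findings = check_objectives(
    rpo=timedelta(minutes=15),
    rto=timedelta(days=1),
    backup_interval=timedelta(minutes=5),
    measured_restore_time=timedelta(weeks=2),
)
for finding in findings:
    print(finding)
```

    With these numbers the RPO is met and the RTO is not, which is exactly why near-instant backups do not make a tested recovery plan unnecessary.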

    You can also look at it the other way.... if RPO/RTO is irrelevant, then why should you perform backups anyway? ;)

    ------------------------------
    Sven De Preter

    ------------------------------



  • 13.  RE: Development of a Disaster Recovery Plan (DRP)

    Posted 10 Feb, 2020 04:25
    Well said. Thanks again.

    ------------------------------
    Bader Abuhilal
    Information Systems Auditor
    ------------------------------



  • 14.  RE: Development of a Disaster Recovery Plan (DRP)

    Posted 10 Feb, 2020 07:49
    Can anyone provide a good, informative template for a typical DR plan?

    ------------------------------
    Bader Abuhilal
    Information Systems Auditor
    ------------------------------



  • 15.  RE: Development of a Disaster Recovery Plan (DRP)

    Posted 11 Feb, 2020 04:20
    Hi Bader,

    As you may know, the IT DR plan is a subset of the organization's BCP. The BCP would have identified the organization's prioritised processes and activities.  Many of those processes and activities will have ICT dependencies, and also a predefined time frame within which the technology (ICT) resources should be available for the business processes to resume.
    In short, the IT DR plan should be built around this principle. I am not in a position to share any of the DR plans I have developed or audited, as I have to comply with client NDAs.
    However, some good ideas can be picked up here: https://searchdisasterrecovery.techtarget.com/tip/Top-five-free-disaster-recovery-plan-templates
    I am not recommending that you follow it, but you can get good insight from it.
    Best regards
    Nalin

    ------------------------------
    Nalin Wijetilleke MBA, CISA, CGEIT, FBCI, PMP, CMC
    2019 Online Forum Topic Leader
    Managing Director, ContinuityNZ Ltd.
    ------------------------------