Ch 19 - Disaster Recovery

Lesson 19-Disaster Recovery, Business Continuity, and Organizational Policies

Disaster Recovery

Whether from a virus, hacking, or simply a fire, flood, or other act of nature, there may come a time when you experience a catastrophic loss of computer data.

How well you prepare for a disaster and plan to mitigate it dictates how long operations are disrupted.
– These events do not happen often.
– It is more likely that business operations will be interrupted due to employee error.

Areas to address:
– Plans/Process
– Backups
– Utilities
– Secure Recovery
– High Availability and Fault Tolerance
– Computer Incident Response Teams
– Test, Exercise, and Rehearse

Categories of Business Functions

A BIA (business impact assessment) and a DRP (disaster recovery plan) categorize functions based on how critical they are to business operations.
– Critical
• The function is essential for operations; without it, the basic mission of the organization cannot be accomplished.
– Necessary for normal processing
• The function is required for normal processing, but the organization can do without it for a short period of time (such as less than 30 days).
– Desirable
• The function is not needed for normal processing, but it enhances the organization's ability to conduct its mission efficiently.
– Optional
• The function is not needed, and no subsequent processing will be required to restore it.
– If a function does not fall into any of these categories because it does not really affect operations, consider eliminating it.
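The categories above can be treated as an ordered scale that drives the restoration order. The following is a minimal sketch under that assumption; the `Criticality` enum, the `restoration_order` helper, and the sample business functions are hypothetical names chosen for illustration, not part of any standard.

```python
from enum import IntEnum


class Criticality(IntEnum):
    """Categories from the list above, ordered from most to least critical."""
    CRITICAL = 1
    NECESSARY = 2   # necessary for normal processing
    DESIRABLE = 3
    OPTIONAL = 4


def restoration_order(functions):
    """Return function names sorted so the most critical come first.

    Ties within a category are broken alphabetically to keep the
    result deterministic.
    """
    ranked = sorted(functions.items(), key=lambda item: (item[1], item[0]))
    return [name for name, _level in ranked]


# Hypothetical business functions, for illustration only.
functions = {
    "intranet search": Criticality.DESIRABLE,
    "order processing": Criticality.CRITICAL,
    "cafeteria menu page": Criticality.OPTIONAL,
    "payroll": Criticality.NECESSARY,
}

for name in restoration_order(functions):
    print(f"{functions[name].name:<10} {name}")
```

A function that fits none of the categories would simply be left out of the mapping, which matches the advice to consider eliminating it.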

Business Continuity Plan

The plan to continue organizational operations is the business continuity plan (BCP).
– It focuses on the continued operation of the business or organization.

A BCP emphasizes the critical systems needed to operate.
– The BCP describes the functions that are most critical.
– These are determined by the business impact assessment (BIA).
– The BCP often describes the order in which functions should be returned to operation.

The focus of a disaster recovery plan (DRP) is on continued operation after a disaster.

http://www.b2bcontinuity.com/businesscontinuityplanning.html

Disaster Recovery Plan

A DRP (disaster recovery plan) defines the data and resources (hardware, software, and computer personnel) necessary, and the processes and procedures required, to restore critical processes.
– The specific steps required to restore operations should be documented.
– They should be reviewed and exercised on a periodic basis.

It is essential that the DRP is approved by management.
Exercising disaster recovery plans and processes before a disaster helps discover flaws or weaknesses in the plans.
The DRP should describe the critical functions that are to be restored first.

http://www.gartner.com/5_about/news/disaster_recovery.html

BIA

To create the BIA, answer the following questions for all critical functions:
– Who is responsible for the operation of this function?
– What do these individuals need to perform the function?
– Where will this function be performed?
– When should this function be accomplished?
– How is this function performed (what is the process)?
– Why is this function so important or critical to an organization?

Backups

Backups are key to a BCP or DRP. Hardware and storage media failure leading to corruption of critical data is a source of disaster.
– The backup strategy should consider:
• How frequently should backups be conducted?
• How extensive do the backups need to be?
• What is the process for conducting backups?
• Who is responsible for ensuring that backups are created?
• Where will the backups be stored?
• How long will backups be kept?

Depending on the type of organization, there may be legal requirements for conducting backups that will affect the factors mentioned previously.

http://www.computerworld.com/hardwaretopics/storage/story/0,10801,100586,00.html

What Needs to Be Backed Up

A good backup plan will consider more than just the data. It will include:
• Application programs needed to process the data.
• The operating system and utilities that the hardware platform requires to run the applications.

The DRP should also address other items related to backups, such as:
– Personnel
– Equipment
– Electrical power

Types of Backups

There are four basic types of backups: full, delta, differential, and incremental.

Full backup
– An occasional full backup must be conducted.

Delta backup
– Later, when a delta backup is conducted at specific intervals, only the portions of the files that have been changed are stored.

Differential backup
– Only the files and software that have changed since the last full backup are stored.

Incremental backup
– Any file that has changed since the last backup (full or partial) is stored.

The type selected will greatly affect the overall backup strategy, plans, and processes.

How frequently should backups be performed?
– You should consider how long an organization can survive without current data.
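The difference between full, differential, and incremental backups comes down to which reference point a file's modification time is compared against. The sketch below illustrates that selection rule only; `FileInfo`, `select_files`, and the timestamps are hypothetical, and real backup tools often rely on archive bits or catalogs rather than raw modification times.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    modified: float  # modification time, e.g. seconds since the epoch


def select_files(files, kind, last_full, last_backup):
    """Return the paths a backup of the given kind would copy.

    last_full   -- time of the most recent full backup
    last_backup -- time of the most recent backup of any kind
    """
    if kind == "full":
        return [f.path for f in files]
    if kind == "differential":
        cutoff = last_full      # everything changed since the last full backup
    elif kind == "incremental":
        cutoff = last_backup    # everything changed since the last backup at all
    else:
        raise ValueError(f"unknown backup kind: {kind!r}")
    return [f.path for f in files if f.modified > cutoff]


# Hypothetical timeline: full backup at t=10, an incremental at t=20.
files = [FileInfo("a.doc", 5), FileInfo("b.xls", 15), FileInfo("c.db", 25)]
for kind in ("full", "differential", "incremental"):
    print(kind, select_files(files, kind, last_full=10, last_backup=20))
```

The trade-off this exposes: a differential set grows until the next full backup but restores from two copies (the full plus the latest differential), while incremental sets stay small but every one since the last full backup is needed to restore.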

Backup Rule of Three

Multiple backups should be maintained. There are several approaches to backup retention; a common and easy-to-remember one is the “rule of three.”
– This entails simply keeping the three most recent backups. When a new backup is created, the oldest copy is overwritten.
– In certain environments, regulatory issues may prescribe a specific frequency and retention period.
– It is important to know an organization and its requirements when determining how often a backup will be created and how long it will be kept.
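The rule of three amounts to a fixed-size rotation. A minimal sketch, assuming backups are identified by simple labels (the `BackupRotation` class and the day labels are hypothetical):

```python
from collections import deque


class BackupRotation:
    """Keep only the most recent `keep` backups, oldest first."""

    def __init__(self, keep=3):
        self.copies = deque(maxlen=keep)

    def add(self, label):
        """Record a new backup and return the label it overwrites, if any."""
        overwritten = None
        if len(self.copies) == self.copies.maxlen:
            overwritten = self.copies[0]
        self.copies.append(label)  # deque drops the oldest entry when full
        return overwritten


rotation = BackupRotation()
for day in ("mon", "tue", "wed", "thu"):
    overwritten = rotation.add(day)
    print(f"backup {day}: overwrote {overwritten}, now holding {list(rotation.copies)}")
```

Where regulation prescribes a retention period, the `keep` count would follow from that period and the backup frequency rather than being fixed at three.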

If you are not in an environment where regulatory issues dictate the frequency and retention, your goal will be to optimize the frequency.


To determine the optimal backup frequency, consider:
– The cost of the backup strategy chosen.
– The cost of recovery if the backup strategy is not implemented (meaning no backups were created).
– The probability that the backup will be needed on any given day.

The two figures to consider are then:
– (probability the backup is needed AND cost of restoring with no backup)
• This figure is the probable loss that an organization can expect if no backup is conducted.
– (probability the backup isn't needed AND cost of the backup strategy)
• This figure is the price an organization is willing to pay (lose) to ensure that it can restore, should a problem occur.

To optimize the backup strategy, the correct balance between these two figures needs to be determined.
– Working with these two calculations is a cost-avoidance exercise.
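The two figures can be worked through numerically. The sketch below assumes that "AND" in the expressions above means multiplying the probability by the cost, which is the usual expected-value reading but is an interpretation rather than something the text states. The function names and the sample numbers are hypothetical.

```python
def expected_loss_without_backup(p_needed, restore_cost_no_backup):
    """Probable loss if no backup exists: P(needed) * cost of restoring without one."""
    if not 0.0 <= p_needed <= 1.0:
        raise ValueError("p_needed must be a probability between 0 and 1")
    return p_needed * restore_cost_no_backup


def expected_spend_when_unused(p_needed, backup_cost):
    """Price paid for a backup that turns out not to be needed: P(not needed) * backup cost."""
    if not 0.0 <= p_needed <= 1.0:
        raise ValueError("p_needed must be a probability between 0 and 1")
    return (1.0 - p_needed) * backup_cost


# Hypothetical daily figures, for illustration only.
p_needed = 0.01                 # chance the backup is needed on a given day
loss = expected_loss_without_backup(p_needed, restore_cost_no_backup=50_000)
spend = expected_spend_when_unused(p_needed, backup_cost=100)

print(f"probable loss with no backup:   {loss:.2f}")
print(f"price paid for unused backups:  {spend:.2f}")
```

With these made-up numbers the probable loss is several times the price of the backup, which argues for backing up at least this often; a less frequent schedule lowers the second figure while raising the first.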


When calculating the cost of the backup strategy, consider:
– The cost of the backup media required for a single backup
– The storage costs for the backup media and the retention policy
– The labor costs associated with performing a single backup
– The frequency with which backups are created

The best strategy is to keep copies of backups in separate locations.
– The most recent copy could be stored locally, as it is the most likely to be needed.
– Other copies can be kept at other locations.

Alternate Sites

Offsite Storage
A recent advance is online backup services.
– A number of third-party companies offer high-speed connections for storing data on a frequent basis.

Where should restoration services be conducted?
– If an organization has suffered physical damage to a facility, having offsite storage of data is only part of the solution.
– Data needs to be processed somewhere.

Hot site - A fully configured environment similar to the normal operating environment.
Warm site - Partially configured, usually having the peripherals and software but perhaps not the more expensive main processing computer.
Cold site - Has the basic environmental controls needed to operate but few of the computing components needed.
Mobile backup - Trailers with the required computers and electrical power that can be driven to a location within hours of a disaster and set up to commence processing immediately.


A less expensive alternative is a mutual aid agreement.
– Similar organizations agree to assume the processing for the other party, with the following assumptions:
• Both organizations will not be hit by the same disaster.
• Both have similar processing environments.
– Such an arrangement may not be legally enforceable, even if it is in writing.

Long-Term Storage of Backups

How long backups last depends on the media:
– Magnetic media degrade over time (measured in years).
• Tapes can be used a limited number of times before the surface begins to flake off.
• Storage of media in a steel safe or file cabinet may accelerate the process.

Other considerations:
– Software applications also evolve, and the media may not be compatible with current versions of the software.
– If the file you stored is encrypted, then passwords are needed to decrypt the file to restore the data.


Utilities - Power

Emergency power must be planned for in case of disruption.
For short-term interruptions, a UPS may suffice.
Beyond a few minutes, another source of power is required.
– Backup generators are not a simple, maintenance-free solution.
• Generators should be tested on a regular basis.
• They can become strained if they power too much equipment; therefore, ensure the reserve capacity is beyond the anticipated load.
• They take time to start up. A UPS should be used for a smooth transition to backup power.
– Generators are expensive and require fuel; they should be kept in a place where they can be fueled.
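The two sizing checks implied above (reserve capacity beyond the anticipated load, and a UPS that covers the generator's start-up time) can be sketched as simple comparisons. The function names, the 25% reserve margin, and the sample figures are assumptions for illustration; actual sizing should follow the equipment vendor's guidance.

```python
def generator_is_adequate(load_watts, generator_watts, reserve_fraction=0.25):
    """True if generator capacity covers the anticipated load plus a reserve margin.

    reserve_fraction is a hypothetical planning margin, not a standard value.
    """
    return generator_watts >= load_watts * (1 + reserve_fraction)


def ups_bridges_startup(ups_runtime_seconds, generator_start_seconds):
    """True if the UPS can carry the load until the generator is online."""
    return ups_runtime_seconds > generator_start_seconds


# Hypothetical site figures.
print(generator_is_adequate(load_watts=8_000, generator_watts=10_000))
print(generator_is_adequate(load_watts=9_000, generator_watts=10_000))
print(ups_bridges_startup(ups_runtime_seconds=300, generator_start_seconds=30))
```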

Utilities - Environmental

Environmental conditions
– Air conditioning.
– Mobile backup sites use trailers and rely on generators for their power, but they must also factor in the requirement for environmental controls.
– Depending on the disaster, telephone and Internet communication may be lost.
– Wireless services may also not be available.

Planning redundant communication can help with most outages.
– Backup plans should include the option to continue operations from a different location while waiting for communications to be restored.

Secure Recovery

In the event an organization’s operations are disrupted, several companies offer recovery services that can remotely provide restoration services for critical files and data. These may include:
– Power
– Communications
– Technical support

For both the physical sites and the remote service, security is an important element and must be ensured:
• Confidentiality
• Integrity
• Availability

High Availability and Fault Tolerance

High availability refers to the ability to maintain data and operational processing despite any disruption.
– It requires redundant systems for both power and processing.

Fault tolerance also concerns availability and is accomplished by mirroring data and systems.
– Should a “fault” occur that disrupts a device such as a disk controller, the mirrored system provides the requested data with no interruption in service.
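Mirroring can be illustrated with a toy in-memory store in which every write goes to all healthy replicas and a read is served by the first replica still working. This is only a sketch of the idea; `MirroredStore` is a hypothetical class, and real mirroring (for example in disk arrays) also has to resynchronize a replica after it recovers.

```python
class MirroredStore:
    """Toy mirror: writes go to every healthy replica, reads fail over."""

    def __init__(self, replicas=2):
        self.replicas = [dict() for _ in range(replicas)]
        self.failed = set()

    def write(self, key, value):
        for index, replica in enumerate(self.replicas):
            if index not in self.failed:
                replica[key] = value

    def fail(self, index):
        """Simulate a fault, such as a disk controller going offline."""
        self.failed.add(index)

    def read(self, key):
        for index, replica in enumerate(self.replicas):
            if index not in self.failed:
                return replica[key]
        raise RuntimeError("all replicas have failed")


store = MirroredStore()
store.write("invoice-17", "paid")
store.fail(0)                      # primary is lost
print(store.read("invoice-17"))    # still served by the mirror
```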

Computer Incident Response Teams

A plan should include establishing a Computer Incident Response Team (CIRT) or a Computer Emergency Response Team (CERT).
– The team should be created and team members notified before an incident occurs.
– The team includes technical and non-technical individuals who provide guidance on ways to handle media attention, legal issues, and management issues.
– The team consists of permanent and ad hoc members.

The CIRT conducts investigations of the incident and makes recommendations about how to proceed.
– Policies and procedures for investigation should also be worked out in advance.
– It is also advisable to have the team periodically meet to review these procedures.

http://labmice.techtarget.com/security/incidentresponse.htm

Test, Exercise, and Rehearse

The BCP, DRP, backup procedures, methods for addressing computer incidents, and other plans should be tested.
– All parties should practice the established procedures.
– As many recovery functions as possible should be performed.
– Care should be taken not to impact actual operations.

Rehearsal of portions of the recovery plan should include:
– Items that are disruptive to actual operations.
– Items identified as needing more frequent activation due to either their importance or the need for continual practice.

Policies and Procedures

Policies are high-level statements made by management laying out an organization's position on some issue.
– Policies are mandatory but are not specific in their details.
– Policies are focused on the result, not the methods for achieving that result.

Standards are specifications providing specific details on how a policy is to be enforced.

Procedures are step-by-step instructions describing exactly how employees are expected to act in a given situation or to accomplish a specific task.

http://www.sans.org/resources/policies/#name


There are security policies that every organization should have in place.
– Acceptable use
– Due care
– Separation of duties
– Password management

Other important policy-related issues include:
– Privacy
– Service level agreements
– Human resources policies
– Code of ethics
– Incident response

Security Policies

The security policy should describe how security is handled from an organizational point of view.
– It describes which office and corporate officer or manager oversees the organization's security.

The security policy should be reviewed regularly and updated as needed. Policies should be updated less frequently than the procedures that implement them.
– High-level goals do not change as often as the environment in which they must be implemented.

All policies should also be reviewed by legal counsel.
– A plan should be outlined describing how all employees will be made aware of the policies.

Policies can be made stronger by including references to the authority who made the policy.
– For example, state whether the policy comes from the CEO or is a department-level policy.
– Refer to any laws or regulations applicable to the specific policy and environment.

Acceptable Use

An acceptable use policy (AUP) outlines the appropriate use of company resources, computer systems, and networks.
– It should delineate what activities are not allowed.

The policy should consider:
– Use of resources to conduct personal business
– Installation of hardware or software
– Remote access to systems and networks
– Copying of company-owned software
– User responsibility to protect company assets (data, software, and hardware)

Statements regarding penalties for violating the policies (such as termination) should also be included.
– Penalties should not outweigh the related offense.

http://compnetworking.about.com/library/weekly/aa021700a.htm


The AUP should also address appropriate use by the organization.
– Is it appropriate to monitor an employee's use of the systems and network?
– If any information gathered during monitoring is used in a civil or criminal case, be able to answer the following questions:
• Did the employee have an expectation of privacy?
• Was it legal for the organization to be monitoring?
– The statement that use of the system constitutes consent to monitoring should be referred to.

Legal counsel should be consulted before any monitoring is conducted.
– The actual wording of the warning message is important. It establishes certain rights and responsibilities.

Internet Usage Policy

The Internet usage policy ensures employee productivity and limits liability from inappropriate use of the Internet.
– This policy addresses which sites employees are allowed to visit.
– If the company allows employees to surf the Web during non-work hours, the policy should spell out acceptable parameters, including times and prohibited sites.
– The policy should describe under what circumstances an employee would be allowed to post something from the organization's network on the Web.

E-Mail Usage Policy

The e-mail usage policy is like the Internet usage policy.
– It states what the company will allow employees to send in terms of e-mail.
• The policy should spell out whether non-work e-mail traffic is allowed or restricted.
• It should cover the type of message that would be considered inappropriate to send to other employees.
• The policy should specify any disclaimers that must be attached to an employee's message sent outside the company.

Due Care

Due care and due diligence are terms used in the legal and business community to address issues where one party's actions may have caused loss or injury to another party.

The law recognizes the responsibility of an individual or an organization to act reasonably.

Reasonable actions should be taken to demonstrate that the organization is being responsible.
– Organizations should protect the information that they maintain on individuals.
– The standard applied (reasonableness) is subjective and is often determined by a jury.

The organization must show it has taken reasonable precautions to protect the information.
– It must also show that, despite these precautions, an unforeseen security event occurred that caused the injury to the other party.

Many sectors have “security best practices” for their industry that serve as a basis for due care in that sector.
– If the organization does not follow the industry best practices, it should be prepared to justify its actions in court should an incident occur.

In any case, you must have a security policy.

http://www.net-security.org/article.php?id=777

Separation of Duties

Separation of duties is a principle employed in many organizations to ensure that no single individual has the ability to conduct transactions alone.
– The trust placed in any one individual is lessened.
– The risk of catastrophic damage to the organization is also decreased.

Separation of duties as a security tool is a good practice.
– It is possible to go overboard and break transactions up into too many pieces or require too much oversight.
• This yields inefficiency and may even be less secure.
• Individuals may not scrutinize transactions thoroughly because they know others will review them.

Separation of duties spreads responsibilities over an organization so no single individual becomes indispensable.
– No one has the “keys to the kingdom” or unique knowledge about how the system works.
– Each task should have a primary and backup person.
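One common way to enforce separation of duties in software is a two-person rule: a transaction may only run once someone other than the requester has approved it. The sketch below shows just that check; the `Transaction` class and the user names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transaction:
    description: str
    requested_by: str
    approved_by: Optional[str] = None

    def approve(self, approver):
        """Record an approval; the requester may not approve their own request."""
        if approver == self.requested_by:
            raise PermissionError("requester cannot approve their own transaction")
        self.approved_by = approver

    def can_execute(self):
        """A transaction may run only after a second person has approved it."""
        return self.approved_by is not None


payment = Transaction("wire transfer to supplier", requested_by="alice")
print(payment.can_execute())   # no approval yet
payment.approve("bob")
print(payment.can_execute())   # approved by a different person
```

Keeping the rule to a single second approver reflects the caution above: adding many more review steps tends to cost efficiency without adding scrutiny.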

Need to Know

Need to know and least privilege are principles that ensure that each individual in the organization is only supplied with the minimum amount of information and privileges they need to perform their tasks.

To obtain access to any piece of information, individuals must justify why they “need to know” it.
– They will be granted only the bare minimum of privileges needed to perform their jobs.
– A policy spelling out these principles as guiding philosophies should be created, and it should address who may grant access or assign privileges.
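In code, least privilege usually shows up as a default-deny check: a request succeeds only if the privilege was explicitly granted. A minimal sketch, with a hypothetical `GRANTS` table and privilege names:

```python
# Hypothetical grant table: each user holds only what their job requires.
GRANTS = {
    "alice": {"payroll:read"},
    "bob": {"payroll:read", "payroll:write"},
}


def is_allowed(user, privilege, grants=GRANTS):
    """Default deny: allowed only if the privilege was explicitly granted."""
    return privilege in grants.get(user, set())


print(is_allowed("alice", "payroll:read"))
print(is_allowed("alice", "payroll:write"))
print(is_allowed("mallory", "payroll:read"))
```

Who is permitted to edit the grant table is exactly the question the policy should answer.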

http://www.pcworld.com/resource/article/0,aid,120314,pg,1,RSS,RSS,00.asp

Password Management

A password management policy should address:
– The procedures for selecting passwords (if users are allowed to select them)
• Length, character set, etc.
– The frequency with which they must be changed, and how they will be distributed.
– Procedures for creating new passwords should an employee forget a password.
– The acceptable handling of passwords.
– Password cracking by administrators.
• It discovers weak passwords selected by employees.
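The length and character-set rules of a password policy can be checked mechanically. The following sketch assumes a hypothetical policy (at least 12 characters, with lowercase, uppercase, digit, and punctuation); the thresholds are illustrative and should be replaced by whatever the organization's policy actually specifies.

```python
import string


def check_password(password, min_length=12):
    """Return a list of policy problems; an empty list means the password passes.

    The rules and the default minimum length are hypothetical examples.
    """
    problems = []
    if len(password) < min_length:
        problems.append(f"shorter than {min_length} characters")
    if not any(c.islower() for c in password):
        problems.append("no lowercase letter")
    if not any(c.isupper() for c in password):
        problems.append("no uppercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("no digit")
    if not any(c in string.punctuation for c in password):
        problems.append("no punctuation character")
    return problems


for candidate in ("abc", "Tr1cky-Passphrase!"):
    print(candidate, "->", check_password(candidate) or "ok")
```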

Disposal and Destruction

Many intruders know the value of “dumpster diving.”
– Documents, letters, scratch paper, old removable storage media, and old equipment all have value.
– Intruders may find e-mail or sticky notes with passwords and user IDs.

Organizations must have a strong disposal and destruction policy.
– Important papers should be shredded.
– Magnetic storage media should have all files deleted.
• The media should be overwritten at least three times: with all ones, all zeros, and then random characters.
• A safer method is to destroy the data by using a strong magnetic field to degauss the media.
• One can even use a file to remove the magnetic material from the surface of the platter.
• Shredding floppy media is normally sufficient.
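The three-pass overwrite described above (all ones, all zeros, then random data) can be sketched for a single file as follows. The `overwrite_file` function is a hypothetical illustration only: on solid-state drives and on journaling or copy-on-write file systems, writing over a file in place does not guarantee that the original blocks are overwritten, so this should not be taken as a complete sanitization procedure.

```python
import os


def overwrite_file(path, chunk_size=64 * 1024):
    """Overwrite a file in place with all ones, all zeros, then random bytes."""
    size = os.path.getsize(path)
    passes = [b"\xff", b"\x00", None]   # None marks the random pass
    with open(path, "r+b") as handle:
        for pattern in passes:
            handle.seek(0)
            remaining = size
            while remaining > 0:
                count = min(chunk_size, remaining)
                if pattern is None:
                    data = os.urandom(count)
                else:
                    data = pattern * count
                handle.write(data)
                remaining -= count
            handle.flush()
            os.fsync(handle.fileno())   # push this pass to the device before the next
```

After the final pass the file can be deleted with `os.remove(path)`; for whole devices, degaussing or physical destruction as listed above remains the safer route.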

Privacy and Service Level Agreements

Organizations should have a privacy policy that explains the guiding principles used to guard personal data they access.
– Customers have a legal right to expect that their information is kept private.
• Organizations violating this trust may be sued.
– In the field of health care, federal regulations have been created that prescribe stringent security controls on private information.

Service level agreements (SLAs) are contractual agreements between entities specifying levels of service that two organizations agree upon.
– These agreements delineate expectations in terms of the service provided, support expected, and penalties for failure.

The SLA should include a section on the service provider's responsibility for business continuity and disaster recovery, as well as the provider's backup plans and processes for restoring lost data.

Human Resources Policies

You should hire individuals who can be trusted with your data and that of your clients.
– You need policies to address when employees leave the organization.

Run background checks on prospective employees and check references.
– Drug tests
– Past criminal activity
– Educational background
– Work history

For sensitive environments, security background checks are required.

Once an employee is hired, you should minimize the risk that the employee will “turn against you.”
– Periodic reviews by supervisory personnel, additional drug checks, and monitoring of activity during work should be considered.
– Your ability to monitor may be restricted if it is not spelled out in advance.

New employees should be made aware of all pertinent security policies.
– Documents should be executed acknowledging that they have read and understood them.
– Changes to privileges should be clearly spelled out in the policies.

Employee Retirement, Separation, or Termination

Employees who retire by choice may announce their retirement weeks or even months in advance.
– Limiting access to sensitive documents the moment they announce their intention is the safest thing to do, although it may not be necessary.
– Each situation should be evaluated individually.
– If they leave for a better job, you may decide to allow them to gracefully transfer their projects to other employees.
– The decision should be considered carefully if the new company is a competitor.

If the employee is terminated, immediately revoke their access privileges:
– Access cards, keys, and badges should be collected.
– The employee should be escorted to their desk, watched as they pack their personal belongings, and then escorted from the building.
– Combinations should be changed quickly once they have been informed of their termination.

Code of Ethics

Professional organizations have established codes of ethics for their members.
– Each of these describes the expected behavior of their members from a high-level standpoint.

For organizations, a code of ethics can set the tone for how employees will be expected to act and conduct business.
– The code should:
• Demand honesty from employees.
• Require that they perform all activities in a professional manner.
• Address principles of privacy and confidentiality.
• State how employees should treat client and organizational data.
• Address conflicts of interest.

http://www.sans.org/resources/ethics.php

Incident Response Policies

An incident response policy and procedure should be developed to outline how the organization will deal with security incidents when they occur.
– Preparation
– Detection
– Containment and Eradication
– Recovery
– Follow-On Actions

Preparation

Preparing for an incident is the first phase, and the steps to take should be established in advance.
– The points of contact should be determined and employees should be trained.
– The equipment necessary to detect, contain, and recover from an incident should be acquired, and those who will use the equipment should be trained.
– Training in computer forensics should also be accomplished.

Incident response team membership varies depending on the type of incident:
– A higher-level manager who can obtain cooperation from employees as needed should lead.
– A computer or network security analyst is useful, as is the security office.
– Specialists may be added for specific hardware or software platforms as needed.
– The organization's legal counsel should be part of the team.
– The public affairs office should be available on an as-needed basis to formulate the public response should the event become public.
– There should be a point of contact for the team in case criminal activity is suspected.

Detection

The most likely group of individuals to discover an incident will be the network and security administrators via the organization's firewalls and IDSs.

Procedures should be established to check for possible security events.

The tools and training should be identified during the preparation phase described earlier.

Social engineering policy and training are essential.

A reporting procedure needs to be in place for employees. Everyone should know whom to call and what to do if something is suspected.
– A reporting template should be given to any individual who suspects an incident.
– One of the first jobs of the incident response team is to determine whether an actual incident has occurred.
– Each reported incident must be investigated and treated as a possible incident.

Containment and Eradication

The team should quickly contain the problem.

Decide whether to restore operations or prosecute.

If the decision to prosecute is made, specific procedures need to be followed in handling potential evidence.

A decision that should be made quickly is how to address containment.

If an intruder is on the system, one response is to disconnect from the Internet until the system can be restored and vulnerabilities patched.

Once the incident has been contained and malicious software or vulnerabilities have been taken care of, the procedures described earlier should be put into action.
– The goal should be to have the organization back to normal processing as soon as possible.

Follow-On Actions

Once operations have been restored to their pre-incident state, a few details need to be taken care of.
– Senior-level management should be informed of what occurred.
– Recommendations should be made to improve processes and policies so that a repeat will not occur.
– If prosecution of the individual responsible is desired, then additional time will be spent in helping law enforcement agencies and in possible testimony.
– Training material should be developed or modified as part of the new policies and procedures.