#AIIM14 The Good, the Bad, and the Ugly of Defensible Disposition. Richard Medina, Co-founder and Principal Consultant, Doculabs | doculabs.com | [email protected] | richardmedinadoculabs.com | @richarddoculabs

The Good, The Bad, and The Ugly of Defensible Disposition


DESCRIPTION

Most organizations hoard their piles of files and fail to destroy them in a legally defensible manner when business and law allow. How do you tackle the monster problem of over-retention of electronic information? In this session, Rich shows how to develop and execute the four most important steps in defensible disposition: the Defensible Disposition Policy, Assessment Plan, Technology Plan, and Disposition Plan. He’ll also outline business case development and tool selection.


Page 1: The Good, The Bad, and The Ugly of Defensible Disposition

#AIIM14

The Good, the Bad, and the Ugly of Defensible Disposition

Richard Medina
Co-founder and Principal Consultant, Doculabs | doculabs.com
[email protected] | richardmedinadoculabs.com | @richarddoculabs

Page 2: The Good, The Bad, and The Ugly of Defensible Disposition

Issues

1. The problem
  • The sky is falling again
2. Break it into two problems
  • Day-forward versus historical content
3. How to address historical content
  • A defensible disposition methodology
4. Analysis and classification technology
  • Should you use it? Does it work?
5. Doing the Assessment
  • Approaches and results

Page 3: The Good, The Bad, and The Ugly of Defensible Disposition

Issues

1. The problem
  • The sky is falling again

Page 4: The Good, The Bad, and The Ugly of Defensible Disposition

The Problem is Over-Retention

Organizations have been over-retaining electronic information and failing to dispose of it in a legally defensible manner when business and law allow.

• Retaining everything forever
• Disposing of everything immediately
• Having employees make classification decisions
• Having technology make classification decisions
• Hybrid with technology and people

Page 5: The Good, The Bad, and The Ugly of Defensible Disposition

Why Over-Retention is the Problem

• Organizations keep non-required electronic content forever because:
  1. Classifying content (to determine what to keep and what to purge) is manual and expensive
  2. Content worth preserving is mixed with content that should be purged
  3. Legal, and others, are afraid of wrongfully deleting materials (spoliation)
  4. Additional storage is inexpensive, which makes it easy for corporations to buy more storage and defer addressing the problem

Page 6: The Good, The Bad, and The Ugly of Defensible Disposition

Issues

2. Break it into two problems
  • Day-forward versus historical content

Page 7: The Good, The Bad, and The Ugly of Defensible Disposition

Recommendations for Day-forward

• Day-forward information lifecycle management (ILM) is much easier to address than historical content
  • Even though addressing it messes with employees’ day-to-day business activities
• Day-forward: Initiate ILM practices on a “day-forward” basis first, so any new content created or saved is assigned a disposition period
  • Disposition horizons should begin to influence where content is stored (as users discover that materials saved in the “wrong” system will be purged)
• Guidance: Provide employees with explicit guidance for the acceptable use of available tools for dynamic content and their associated retention periods
  • For example, retain non-records for 3 years, retain official records per the retention schedule
• Historical: For historical content, analyze the feasibility of content analytics and autoclassification
  • Recognize that cleaning up TBs of content can take years. So conduct the analysis in 2014, begin the cleanup effort in earnest by 2015, and eliminate a large portion of dated content by 2016

Page 8: The Good, The Bad, and The Ugly of Defensible Disposition

Guidance Example for Day-forward

System/Repository | Recommended Retention Period

Personal Network Drives (“P” drives)
  • Provide each user with personal drive space of a limited size for their storage, for as long as the user is employed

Shared Network Drives (“G” drives)
  • Make them read only (which means no network storage for collaboration; content will have to go into an ECM system)
  • Exceptions include applications or systems that need to use network storage

ECM System
  1. Default for non-records: retained for 3 years
  2. Default for non-records that have long-term value: retained for 7 years
  3. Official records: retained per the retention schedule

Social Community Sites
  • No documents stored in communities (only links to documents in the ECM system)
  • Consider retention periods for non-document content (e.g. 3 years)
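The ECM defaults in the table can be written down as a small lookup, which is roughly what "assigning a disposition period" to new content amounts to. The sketch below is only an illustration: the category keys and the function name are hypothetical, and official records are left to the retention schedule, which the slides do not spell out.

```python
from datetime import date
from typing import Optional

# Retention defaults taken from the ECM row of the table; the category keys
# are hypothetical names chosen for this sketch.
ECM_RETENTION_YEARS = {
    "non_record": 3,
    "non_record_long_term": 7,
    "official_record": None,  # governed by the retention schedule, not modelled here
}

def disposition_date(created: date, category: str) -> Optional[date]:
    """Return the earliest disposition date, or None if the schedule decides."""
    years = ECM_RETENTION_YEARS[category]
    if years is None:
        return None
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # 29 February has no counterpart in a non-leap target year; use 28 February.
        return created.replace(year=created.year + years, day=28)
```

A document saved on 1 April 2014 as a non-record would come up for disposition on 1 April 2017 under these defaults.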

Page 9: The Good, The Bad, and The Ugly of Defensible Disposition

Issues

3. How to address historical content
  • A defensible disposition methodology

Page 10: The Good, The Bad, and The Ugly of Defensible Disposition

What’s the Purpose of Your DD Methodology?

• You must satisfy 4 demands:
  1. Regulatory retention requirements
  2. Hold retention requirements
  3. Business retention requirements
  4. Cost impact of anything you do
• What you do has impact:
  1. What you do
  2. Effects of what you do
• You can do 2 things:
  1. Sort
  2. Dispose
• Your mission stated two ways:
  • Your mission is to satisfy your retention demands (1-3) while minimizing bad cost impact to yourself (4)
  • Your mission is to maximize good cost impact (4) while satisfying your retention requirements (1-3)

Page 11: The Good, The Bad, and The Ugly of Defensible Disposition

It’s Based on Reasonableness

• To determine what “satisfy your retention demands” really means for you, use the Principle of Reasonableness and act in Good Faith
  • Courts do not ask, expect or necessarily reward organizations for perfection. Courts do expect, however, that whatever information management tactics an organization undertakes are appropriate to how that particular entity is situated (size, financial resources, regulatory and litigation profile, etc.). (Jim McGann and Julie Colgan, “Implement a defensible deletion strategy to manage risk and control costs”, Inside Counsel)

Page 12: The Good, The Bad, and The Ugly of Defensible Disposition

Your DD Methodology Has 4 Parts

1. Defensible Disposition Policy
  • It’s your design specification, your business rules for DD, your decision tree
  • Specifies very clearly the objectives that your methodology will fulfill. It states clearly what you mean by your retention requirements and what you mean by reasonable costs when you are trying to fulfill your retention requirements.
2. Technology Approach
  • For Sorting and Disposing
  • You must use technology: it’s not an option

 

Page 13: The Good, The Bad, and The Ugly of Defensible Disposition

Your DD Methodology Has 4 Parts

3. Assessment (Sorting) Plan
  • Do the legwork and look at what’s there
  • What information and systems you’re assessing
  • Your processing rules (decision plan)
  • It will be flexible
4. Disposition Plan
  • Evaluate your assessment results using your DD Policy
  • Dispose (which ranges from keeping forever to deleting right now with many options in between)
  • Refine your DD Policy (1) and continue as needed

 

Page 14: The Good, The Bad, and The Ugly of Defensible Disposition

Issues

4. Analysis and classification technology
  • Should you use it? Does it work?

Page 15: The Good, The Bad, and The Ugly of Defensible Disposition

There’s an Awesome Business Case

• … if the technology works
• 50 TB = ~200 million documents (average of 250KB per document)
• The following table illustrates the time and effort required to classify 200 million documents

Classification Technique | Classification Rate | Pricing | Total Cost to Classify
Manual Classification | 10 seconds per document | $35 / hr. | $20 million
Auto Classification (with 95% machine and 5% human classified, via offshore labor) | Less than 1 second per document | $.005 per document for machine processing and $5 / hr. for those that require manual classification | $2 million
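The arithmetic behind the table can be checked with a few lines. This is a back-of-the-envelope sketch, not the presenter's own model: it assumes the 5% routed to people also take 10 seconds per document and that the machine fee applies only to the 95% handled automatically. Under those assumptions the manual figure comes to about $19.4 million, in line with the slide's $20 million, while the auto-classification figure comes to about $1.1 million, below the slide's $2 million, which may include costs the slide does not itemize.

```python
DOCS = 200_000_000            # slide: 50 TB at roughly 250 KB per document
SECONDS_PER_MANUAL_DOC = 10   # slide: manual classification rate

def manual_cost(docs: float, hourly_rate: float) -> float:
    """Labor cost in dollars of classifying `docs` documents by hand."""
    hours = docs * SECONDS_PER_MANUAL_DOC / 3600
    return hours * hourly_rate

def auto_cost(docs: float, machine_share: float = 0.95,
              per_doc_fee: float = 0.005, hourly_rate: float = 5.0) -> float:
    """Machine fee for the automated share plus labor for the remainder.

    Assumes (not stated on the slide) that manually reviewed documents
    also take SECONDS_PER_MANUAL_DOC each.
    """
    machine = docs * machine_share * per_doc_fee
    human = manual_cost(docs * (1 - machine_share), hourly_rate)
    return machine + human

print(f"manual: ${manual_cost(DOCS, 35.0):,.0f}")
print(f"auto:   ${auto_cost(DOCS):,.0f}")
```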

Page 16: The Good, The Bad, and The Ugly of Defensible Disposition

Analysis and Classification Technologies

• Many different kinds of technology vendors are addressing analysis, classification, and disposition
  • File Analytics, Content Analytics, Content Classification, ECM, E-discovery, Search, Capture, DLP, Storage Management
  • Products, hosted solutions, service providers
  • IBM/Stored IQ, HP/Autonomy, EMC Kazeon, SAS, Kofax, Equivio, Rational Retention, Recommind, Index Engines, and others
• Most have a sweet spot where they will succeed
• But it’s highly dependent … on just about every factor you can think of
  • E.g., your business purposes, your ECM environment, your “information architecture”, your document types and their complexity and volume, the value and risk of the documents, your success criteria, etc., etc., etc.

Page 17: The Good, The Bad, and The Ugly of Defensible Disposition

Sidebar: How Many of Them Work

Before: <server XXX, drive G:> Forecast summary_121008.doc

After:
  Record = no
  Age = 2.5 years
  Document type = departmental forecast
  Keywords = forecast, 2008, draft
  Status = delete
  Confidence = 9.2 (out of 10)

1. Analyze the content and review the retention schedule
2. Establish classification rules and train the systems with examples
3. Crawlers and recognition engines evaluate the content and generate a classification
4. For content where a high machine confidence factor exists, content is automatically tagged and then staged for migration to the appropriate system or disposition
5. For content with low confidence factors, documents are routed to clerical staff (onshore or offshore) for manual classification
6. The results of the manual identification are fed back into the automated algorithms to “teach” the systems better classification

Client Validation: Throughout the process, results and samples are routed to records management and legal professionals within the firm for validation and confirmation
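Steps 4 and 5 of the workflow boil down to routing on a confidence score. The sketch below shows that routing only; the threshold of 8.0 on the slide's 0 to 10 scale, the class name, and the route labels are all assumptions made for illustration, not values from any particular product.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 8.0  # hypothetical cut-off on the slide's 0-10 scale

@dataclass
class Classification:
    path: str
    status: str        # e.g. "delete" or "retain"
    confidence: float  # machine confidence, 0-10

def route(result: Classification, threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """High-confidence results are tagged and staged; the rest go to people."""
    if result.confidence >= threshold:
        return "auto-tag and stage"
    return "manual review"
```

With this cut-off, the slide's example (status = delete, confidence = 9.2) would be tagged and staged automatically, while a result scored 4.0 would go to clerical staff. Where to set the threshold is exactly the kind of decision the client validation step is there to check.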

Page 18: The Good, The Bad, and The Ugly of Defensible Disposition

Issues

5. Doing the Assessment
  • Approaches and results

Page 19: The Good, The Bad, and The Ugly of Defensible Disposition

Assessment Approaches

• There are three categories of attributes that can be used to determine what a file is:
  1. Environmental attributes around the file (e.g., file location, ownership)
  2. File attributes about the file (e.g., file type, age, author)
  3. Content attributes within the file (e.g., keywords, character strings, word proximity, word density)
• Various techniques and technologies, along with business rules, can be used to determine what a file is, and whether it is eligible for disposition
  • E.g., a DOC file created over 5 years ago and not accessed for a year may be purged
  • This type of purging could be done after giving users adequate notice (“move it or lose it” or “hold” for 90 days, then delete)
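The example rule above (a DOC file older than 5 years and untouched for a year, purged after a 90-day notice) can be expressed as a simple predicate. This is a minimal sketch under stated assumptions: `FileInfo` and both function names are hypothetical, and a year is approximated as 365 days.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class FileInfo:
    name: str
    created: datetime
    last_accessed: datetime

def eligible_for_purge(f: FileInfo, now: datetime,
                       min_age_years: int = 5, idle_days: int = 365) -> bool:
    """True for a .doc file created over 5 years ago and idle for a year."""
    is_doc = f.name.lower().endswith(".doc")
    old_enough = f.created < now - timedelta(days=365 * min_age_years)
    idle = f.last_accessed < now - timedelta(days=idle_days)
    return is_doc and old_enough and idle

def delete_after(notice_sent: datetime, hold_days: int = 90) -> datetime:
    """Earliest deletion time once users have had the notice period."""
    return notice_sent + timedelta(days=hold_days)
```

In practice the timestamps would come from the file system or a content analytics tool; they are passed in explicitly here so the rule itself is easy to review with records management and legal.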

Page 20: The Good, The Bad, and The Ugly of Defensible Disposition

#1: Environmental Attributes

No. | Attribute | Evaluation Technique | Tool(s) Used | Examples | How Used
1 | Ownership | Access Controls | Content Analytics, Data Loss Prevention, Storage Management | Permissions within LDAP list people and infer department or function | Large collections of files can be assessed en masse based on access controls
2 | Location | File Path | Content Analytics, Data Loss Prevention, Storage Management | G:/accounting/july2004/temp | Stranded and orphaned locations are often easily eliminated

Environmental Attributes (around a file)
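A location check of the kind described in row 2 can be as simple as looking at the folder names in a path. The sketch below is illustrative only: the set of folder names is a hypothetical starting list, and a match marks a location as a candidate for review, not as something to delete outright.

```python
from pathlib import PurePosixPath

# Hypothetical folder names treated as signs of a stranded or scratch location.
STALE_FOLDER_NAMES = {"temp", "tmp", "old", "backup"}

def looks_stranded(path: str) -> bool:
    """True if any folder in the path has a name from STALE_FOLDER_NAMES."""
    parts = [part.lower() for part in PurePosixPath(path).parts]
    return any(part in STALE_FOLDER_NAMES for part in parts)
```

The slide's example path, G:/accounting/july2004/temp, would be flagged because of its final "temp" folder.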

Page 21: The Good, The Bad, and The Ugly of Defensible Disposition

#2: File Attributes

No. | Attribute | Evaluation Technique | Tool(s) Used | Examples | How Used
3 | Duplicate | Hash Algorithm | Content Analytics | Exact duplicates | Exact duplicates can be easily eliminated
3 | Duplicate | Block Read | Content Analytics | Near duplicates | Near duplicates must be assessed in the context of other attributes
4 | File Type | Extension or MIME type | Content Analytics | .TMP, .MP3 | To identify file types that should not exist in a corporate setting
5 | Metadata | Properties | Content Analytics | Age | To determine old materials, materials authored by individuals that have left the organization
5 | Metadata | Properties | Content Analytics | Author | Typically, these attributes must be combined with other attributes via a rule to take action
5 | Metadata | Properties | Content Analytics | Security Profile (Confidential) | Use filename properties to determine type
6 | File Name | Character Strings | Content Analytics | GL-USDIST31_093098.xls | Determine whether a file was system generated vs. human generated
6 | File Name | Character Strings | Content Analytics | FORMUB92_SMITH | Documents that are based on a specific form number can easily be identified

File Attributes (about a file)
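Rows 3 and 4 are the easiest to illustrate. The sketch below groups files with identical SHA-256 digests (exact duplicates) and flags the example extensions from the table. It is a simplified illustration: contents are passed in as bytes to keep it self-contained, whereas a real crawl would stream files from disk, and near-duplicate detection (the "block read" row) is not covered.

```python
import hashlib
import os
from collections import defaultdict

# Extensions the table gives as examples of files that should not exist
# in a corporate setting.
UNWANTED_EXTENSIONS = {".tmp", ".mp3"}

def is_unwanted_type(filename: str) -> bool:
    """True if the file's extension is on the unwanted list (case-insensitive)."""
    return os.path.splitext(filename)[1].lower() in UNWANTED_EXTENSIONS

def find_exact_duplicates(files: dict[str, bytes]) -> list[list[str]]:
    """Group file names whose contents share the same SHA-256 digest."""
    by_digest = defaultdict(list)
    for name, data in files.items():
        by_digest[hashlib.sha256(data).hexdigest()].append(name)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]
```

Which copy of a duplicate group to keep is a policy question (for example, the copy in the system of record) and is left open here.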

Page 22: The Good, The Bad, and The Ugly of Defensible Disposition

#3: Content Attributes

No. | Attribute | Evaluation Technique | Tool(s) Used | Examples | How Used
7 | Key Word | Character Strings | Content Analytics; Classification Module | “Enron”, “Guarantee” | To determine if a document is on Hold via a word list per the hold request
8 | Character or Word Patterns | “Classification” <pattern matching> | Classification Module | Word proximity | To determine the category in which a document may fit
8 | Character or Word Patterns | “Classification” <pattern matching> | Classification Module | Word frequency |
8 | Character or Word Patterns | “Classification” <pattern matching> | Content Analytics; Classification Module | “Privileged” | Identification of PII
8 | Character or Word Patterns | “Classification” <pattern matching> | Content Analytics; DLP | SS#, Credit card # | Regular Expression (RegEx) lists; determined entities for hold, security, IP, PHI, PII, DLP

Content Attributes (within a file)
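The keyword and regular-expression rows can be sketched with Python's `re` module. The patterns below are deliberately simple assumptions (US-style SSNs written with hyphens, 13 to 16 digit card numbers with optional spaces or hyphens) and will miss other formats; a Luhn checksum is added to cut down false positives on card numbers. The hold terms are the two examples from the table.

```python
import re

HOLD_TERMS = ("enron", "guarantee")  # example hold-list words from the table

# Simplified, assumed patterns; real DLP rule sets are broader.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")

def luhn_ok(digits: str) -> bool:
    """Luhn checksum: double every second digit counting from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def scan(text: str) -> set[str]:
    """Return the flags ("hold", "ssn", "card") raised by the text."""
    flags = set()
    lowered = text.lower()
    if any(term in lowered for term in HOLD_TERMS):
        flags.add("hold")
    if SSN_PATTERN.search(text):
        flags.add("ssn")
    for match in CARD_PATTERN.finditer(text):
        if luhn_ok(re.sub(r"\D", "", match.group())):
            flags.add("card")
    return flags
```

A file that raises "hold" would be excluded from disposition regardless of its age or type, which is why the keyword check belongs ahead of any purge rule.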

Page 23: The Good, The Bad, and The Ugly of Defensible Disposition

Assessment Results

Preservation | Findings
Unnecessary File Types (executables, non-business pictures, movies, etc.) | 13 to 15%
Duplicates | 15 to 20%
Near Duplicates | 9 to 30%

Risk | Findings
Files with PII | 10 to 16%
Files with Sample Keywords | 3 to 5%

Operational | Findings
Files 10 years or older | 7 to 11%
Files accessed within the last 18 months | 25 to 35%

Findings not mutually exclusive (i.e., a duplicate file could also be aged)

Page 24: The Good, The Bad, and The Ugly of Defensible Disposition

Assessment Summary

Findings | Enterprise Impact
Total that could be disposed | 20% of 2.5 PB
Enterprise Implications | .5 PB removed @ $5,000,000 per PB
Savings | $2,500,000 per year in storage expense

Technique | Status | % of Total | Total
Analytics | Unnecessary | 20% | 500 TB (.5 PB)
Classification | Record | 8% | 200 TB (.2 PB)
Classification | Non-Record, Business Reference | 28% | 700 TB (.7 PB)
Classification | Evaluated, Staged for Disposition (2016) | 44% | 1,100 TB (1.1 PB)
Total | | 100% | 2,500 TB (2.5 PB)

Page 25: The Good, The Bad, and The Ugly of Defensible Disposition

Assessment Implications

• Given the results, $2.5 million in storage expense could be saved annually on the disposition of historic content, resulting in $12.5 million over 5 years
• Going forward with newly created content, if similar techniques are applied, the saving grows to $34.8 million over 5 years
• The current cost projections are based on the historical content growth rate of 30% per year
• The expected cost projections are based on a content growth rate of 26% per year

@ $5,000,000 per PB | 2012 | 2013 | 2014 | 2015 | 2016* | Total
Current Storage (PB) | 2.5 | 3.25 | 4.23 | 5.49 | 7.14 |
Current Cost (Mill) | $12.5 | $16.3 | $21.1 | $27.5 | $35.7 | $113.0
Expected Storage (PB) | 2 | 2.52 | 3.18 | 4.00 | 3.94 |
Expected Cost (Mill) | $10 | $12.6 | $15.9 | $20.0 | $19.7 | $78.2
Total Savings (Mill) | $2.5 | $3.65 | $5.25 | $7.46 | $16.00 | $34.8

*In 2016, the 1.1 PB or 44% of content from the 2012 historical content assessment can be disposed
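The projection table can be reproduced from the stated inputs: 2.5 PB growing at 30% per year versus 2.0 PB growing at 26% per year, $5 million per PB, and a 1.1 PB disposal in the final year. The sketch below is a reconstruction from those numbers rather than the presenter's own spreadsheet; the function and variable names are made up for illustration. Unrounded, the five-year saving comes to about $34.85 million; the slide's $34.8 million matches the difference of its rounded totals ($113.0 minus $78.2).

```python
COST_PER_PB_MILLIONS = 5.0  # slide: $5,000,000 per PB

def project(start_pb: float, growth: float, years: int = 5,
            dispose_in_final_year: float = 0.0) -> list[float]:
    """Storage (PB) per year with compound growth and an optional final-year disposal."""
    storage = [start_pb * (1 + growth) ** n for n in range(years)]
    storage[-1] -= dispose_in_final_year
    return storage

current = project(2.5, 0.30)                              # 2012-2016, no disposition
expected = project(2.0, 0.26, dispose_in_final_year=1.1)  # after removing 0.5 PB up front
savings = [(c - e) * COST_PER_PB_MILLIONS for c, e in zip(current, expected)]

for year, saved in zip(range(2012, 2017), savings):
    print(year, f"${saved:.2f}M")
print("total", f"${sum(savings):.2f}M")
```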

Page 26: The Good, The Bad, and The Ugly of Defensible Disposition

Conclusions

1. The business case for disposition is strong
  • Costs, risks, and benefits
2. Information governance must be addressed in phases
  • Starting today, the program will take years to mature
  • Set expectations accordingly
3. You should probably address day-forward ILM before tackling historical content
4. Recognize that manual classification is not an option
5. The technologies are immature and varied, but you can be successful by matching the techniques and technologies to the kinds of files you want to target
6. Your DD methodology has 4 main parts: DD Policy, Technology Approach, Assessment Plan, Disposition Plan

Page 27: The Good, The Bad, and The Ugly of Defensible Disposition

Thank You

Richard Medina
Co-founder and Principal Consultant, Doculabs | doculabs.com
[email protected] | richardmedinadoculabs.com | @richarddoculabs

Page 28: The Good, The Bad, and The Ugly of Defensible Disposition

www.aiim.org/infochaos

Do YOU understand the business challenge of the next 10 years?

This ebook from AIIM President John Mancini explains.