When we go down the road of e-mail archiving for regulatory reasons, is there such a thing as 100% complete, or is five-nines (99.999%) accuracy good enough?
What are the consequences of "correct" and "complete" for data archival systems deployed for compliance, legal discovery and, ultimately, corporate risk management?
If you are challenged with the requirement to capture all electronic communications (inbound and outbound), what mechanism, process or IT controls can help you achieve 100% accuracy or, better put, guarantee both complete and correct results? Even in the best-case scenario, can you ever really account for technology outages, software bugs (which can occur at every layer below the application, such as the firmware on a hardware device), or even good old-fashioned human error? The vision of an archival system gives the impression that you are dealing with a very static, durable, almost simple piece of technology that houses long-term data, but nothing could be further from the truth.
The reality is that archival systems technology has evolved more over the last two years than over the entire previous decade, which included the internet, TCP/IP, local and wide area networking and more. Hard to imagine, but true.
This is due to the regulatory shift of the last few years, in which long-term retention (archival) of e-mail became a requirement in the financial services sector - all due to corporate malfeasance. The consequence for archival systems is that they have had to extend the arms of the archive and reach out to actively participate in the details of how the data is captured from its source systems. Traditionally, archives relied on the source system placing content somewhere it could be picked up and then archived.
{A lot like dropping a laundry bag at the back door and having someone from the cleaners come by and pick it up}
Well, imagine if the laundry pickup service didn’t have the bag of laundry waiting for them, and instead had to open the front door of the house, and of every house in the neighborhood, and deal with every type of security system in order to enter. The comparison is the archival system knocking on the door of many different systems (e.g. MS Exchange 2000 and 2003, Lotus Notes 5, 6 and 6.5, instant messaging (IM) systems, ERP) and integrating into each distinct security system, from the application to the OS, plus centralized directory sources. Now imagine the laundry pickup person walking through your home and locating all of the dirty laundry, opening every drawer, closet, and so on… then drawing conclusions about each piece of laundry: which needs light starch, which needs heavy starch, and so on…
The picture is pretty insane, but that is essentially the extended role of today’s archival technologies and, more importantly, it is required if they are to remain both viable and competitive in solving problems for businesses - especially in this climate of heightened awareness of corporate content, security and audit requirements, and their overall pertinence to protecting an organization’s reputation and profit.
With all of the moving parts involved in today’s archival processes, it is critical that these parts all belong to a single comprehensive system. The alternative is an EAI (enterprise application integration) approach, which means connecting these moving parts across different technology vendors. That is a task that, even in the best-case scenario, leaves major exposure with respect to both the completeness and the correctness of the overall system. If we evaluate completeness and correctness at 100% versus several nines of reliability/accuracy, what are we really talking about?
Well, at a very basic level, if your organization’s message traffic is 5 million messages/day, completeness means capturing and archiving each of the 5 million messages transmitted each day. At the end of 5 years your archive would contain 5 million * 264 (business days) * 5 = 6,600,000,000 records (over 6.5 billion).
If your ability to deliver completeness ranked at 99.9% reliability, your 5-year archive, used and relied on for compliance and legal discovery, would be missing 0.1% of those records: 6.6 million. Even at five nines (99.999%) it would still be missing 66,000.
{Can you afford to knowingly go without access to over 6.5 million records?}
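The arithmetic above is easy to rerun for other reliability levels. The following is only a minimal sketch: the traffic figures are the ones used in this post, while the function name `expected_missing` and the choice of capture rates are assumptions made for illustration.

```python
from fractions import Fraction

# Figures taken from the example above.
MESSAGES_PER_DAY = 5_000_000
BUSINESS_DAYS_PER_YEAR = 264
YEARS = 5


def expected_missing(capture_rate: str) -> int:
    """Records expected to be absent from the archive at a given capture rate.

    The rate is passed as a decimal string (e.g. "0.999") so the result is
    computed exactly rather than with floating-point rounding.
    """
    total = MESSAGES_PER_DAY * BUSINESS_DAYS_PER_YEAR * YEARS
    return int(total * (1 - Fraction(capture_rate)))


if __name__ == "__main__":
    for rate in ("0.999", "0.9999", "0.99999"):
        print(f"capture rate {rate}: {expected_missing(rate):,} records missing")
```

Each additional nine cuts the gap by a factor of ten, but on a multi-billion-record archive the gap never reaches zero unless the capture rate is exactly 100%.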
This is the overall importance of completeness. An EAI approach is not the answer for delivering complete record archival: the disconnects between systems present a risk of errors and an inability to manage processes end-to-end. A single flexible platform gives you the tools and redundancies needed to achieve 100% completeness.
A modular approach through a single platform is paramount to the archival process, as logging can be done at each module, which provides the ability to systematically reconcile and roll back in the event of process exceptions. Correctness is yet another area of major consequence that is often overlooked, and it has to be combined with completeness in order to deliver reliable data archival. Data has to be more than just successfully captured, indexed and archived; it must be processed correctly.
{Indexing errors have consequences for correctness: while you have the "complete" record, it may not find its way into a result set if the index was not created in its entirety}
A simple example is record categorization. If a rule is applied incorrectly at the program level, then data can be lost downstream, accidentally deleted, not found, or not provided in the case of legal discovery when it should have been. A more complex example is an indexing operation, the details of which are intricate: what happens if the indexing of an attachment containing 100 pages of MS Word fails to index pages 99-100?
The record, while complete, is not 100% correct. The stakes for correctness are high.
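One way to catch the 100-page example is to compare what the index claims to cover against what the source document contains. This is a hedged sketch, not how any particular archival product works: the `IndexResult` type, its fields and the `unindexed_pages` helper are all hypothetical, and it assumes the indexer can report which page numbers it processed.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class IndexResult:
    """Hypothetical summary emitted by an indexer for one attachment."""
    record_id: str
    pages_in_source: int      # page count reported by the source document
    pages_indexed: Set[int]   # 1-based page numbers the indexer processed


def unindexed_pages(result: IndexResult) -> List[int]:
    """Return the page numbers present in the source but absent from the index."""
    expected = set(range(1, result.pages_in_source + 1))
    return sorted(expected - result.pages_indexed)


if __name__ == "__main__":
    # The attachment from the example: 100 pages, indexing stopped after page 98.
    attachment = IndexResult("msg-001/attachment-1", 100, set(range(1, 99)))
    gaps = unindexed_pages(attachment)
    if gaps:
        print(f"{attachment.record_id}: index incomplete, missing pages {gaps}")
```

A record that fails this kind of check is captured (complete) but not searchable in full (not correct), which is exactly the distinction drawn above.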
Events such as capture, indexing, combining, categorizing and nearly every other discrete process involved in the overall archival operation must be written to a system audit log.
And just as important, if not more so, these log files must be monitored automatically, delivering alerts and triggering events when errors occur in key processes, so that systems personnel can participate in reprocessing the information while maintaining the full integrity of the data and of the correction process.
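The reconciliation idea can be sketched in a few lines. This is an illustration under stated assumptions, not a description of any vendor's system: the stage names in `STAGES`, the `(message_id, stage)` event shape and the `reconcile` function are hypothetical, and a real deployment would read these events from its own audit log and route the gaps to its own alerting.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical processing stages that every message must pass through.
STAGES = ("capture", "index", "categorize", "archive")


def reconcile(
    expected_ids: Iterable[str],
    audit_events: Iterable[Tuple[str, str]],
) -> Dict[str, List[str]]:
    """Map each message that is missing a stage to the stages it is missing.

    expected_ids: message IDs reported by the source system.
    audit_events: (message_id, stage) pairs read from the audit log.
    """
    completed: Dict[str, Set[str]] = {}
    for message_id, stage in audit_events:
        completed.setdefault(message_id, set()).add(stage)

    gaps: Dict[str, List[str]] = {}
    for message_id in expected_ids:
        done = completed.get(message_id, set())
        missing = [stage for stage in STAGES if stage not in done]
        if missing:
            gaps[message_id] = missing
    return gaps


if __name__ == "__main__":
    source_ids = ["m1", "m2", "m3"]
    events = [("m1", s) for s in STAGES] + [("m2", "capture"), ("m2", "index")]
    for message_id, missing in reconcile(source_ids, events).items():
        # In practice this would raise an alert and queue the message for reprocessing.
        print(f"ALERT {message_id}: missing stages {missing}")
```

The point of driving the check from the source system's list of message IDs, rather than from the log alone, is that a message that was never captured leaves no log entry to inspect.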
Few vendors have grown their systems organically and deliver the required capabilities end-to-end through a single system. This is the better and less error-prone approach needed in today’s business climate, for both technical and business reasons, and most importantly for combining both correct and complete.
