Posts Tagged ‘Cloud Computing’

Data Integrity – The “Movable” Pillar of Discovery and Substantiation

Wednesday, April 21st, 2010

By Mark P. Dangelo

www.Innovative-Relevance.com

Integrity.  A character defining word that ranks with ethics, morals, and principles.  Its denotation affirms exceptional personal conduct, while its business implications suggest repeat customers, market status, and brand durability. 

More to the point, who can argue with the need for integrity, authentication, and chain of custody when it comes to financial and personal data that must be examined under scrutiny, forensics (e.g., fraud, transactional), or penalty of law? 

Whereas integrity principles are accepted as beneficial, it is in the achievement of these goals where money is lost and legal cases won – or lost.  There are cascading and hidden risks within, which are only surfaced when misfortune (e.g., delinquencies, foreclosures, class-actions, securitization failures) is internally recognized. 

Yet, for many individuals and organizations, the realization of integrity for data is something more mysterious.  It is commonly pushed deep into the enterprise — to the technologists in the back office.  Data integrity has little relevancy or correlation with today’s corporate strategies, operations, quality conformance, and profits.  Right? 

Fallacy and Reality

It seems every five to seven years, industry specialists and business leaders declare victory over the “hydra” of data integrity – classically defined as having three components: entity, referential, and domain integrity.  With “victory” achieved, the organization shifts its focus to the next problem or market challenge vexing its bottom line.  The data integrity requirements and regulatory mandates (e.g., business rules, data life-cycle, fail-safe controls) fade into the realm of IT myth and folklore.
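For readers less familiar with the three classical components, the following is a minimal sketch of how each is commonly expressed as a database constraint.  It uses Python's built-in sqlite3 module; the borrower and loan tables, their columns, and the rate limits are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical schema: table names, columns, and limits are illustrative assumptions.
conn.executescript("""
CREATE TABLE borrower (
    borrower_id INTEGER PRIMARY KEY,   -- entity integrity: one unique row per borrower
    name        TEXT NOT NULL
);
CREATE TABLE loan (
    loan_id     INTEGER PRIMARY KEY,   -- entity integrity
    borrower_id INTEGER NOT NULL
                REFERENCES borrower(borrower_id),            -- referential integrity
    rate_pct    REAL NOT NULL
                CHECK (rate_pct > 0 AND rate_pct < 30)       -- domain integrity
);
""")

def try_insert(sql, params):
    """Return None on success, or the constraint error message on failure."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql, params)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)

ok = try_insert("INSERT INTO borrower VALUES (?, ?)", (1, "A. Borrower"))
duplicate = try_insert("INSERT INTO borrower VALUES (?, ?)", (1, "Duplicate"))  # entity violation
orphan = try_insert("INSERT INTO loan VALUES (?, ?, ?)", (10, 99, 5.0))         # referential violation
bad_rate = try_insert("INSERT INTO loan VALUES (?, ?, ?)", (11, 1, -2.0))       # domain violation
good = try_insert("INSERT INTO loan VALUES (?, ?, ?)", (12, 1, 5.25))

loan_count = conn.execute("SELECT COUNT(*) FROM loan").fetchone()[0]
```

Only the two valid rows are stored; each rejected insert comes back with the database's own error message.  Constraints of this kind cover the classical definition only, and say nothing about custody, routing, or life-cycle concerns.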

However, with the steady advancement of technologies and practices (e.g., guaranteeing of data integrity in public cloud computing environments), assuming that data integrity requirements have been put to rest creates false security – and lurking liabilities.  Like the heads of the mythological hydra, the requirements grow back and compound, gaining ferocity and becoming a menace to operational sustainability and business viability. 

With litigation and due diligence surrounding the data elementals of handling, storage, authenticity, durability, touchpoints, and isolation on the rise, it appears that the hydra of integrity has found new life.  Some common data integrity misconceptions are frequently voiced as:

·         “Since our organization has structured application systems for our FMG (finance and mortgage group) operations, isn’t it a given that we have the data integrity we are required to have for regulatory compliance?”

·         “We have standards for data entry fields, so why should we be concerned about the elements within the data repositories, marts, applications, and storage farms?  Isn’t data integrity really just about standards and field capture?”

·         “Data integrity is only about old application systems and approaches (e.g., flat files, VSAM, spreadsheets, point-based systems).  We have a commercial database and have spent years creating robust functionality in our origination and servicing systems.  Data integrity was taken care of years ago in our organization.  Sounds like much ado about nothing – like the IT department looking for a budget increase.”

Some of you reading this are probably just about ready to find another article that is more “edgy” or “important” to your organization.  Some would argue that this topic is old and stale and has very little to do with three years of housing turmoil and the current challenges facing FMG survival.  Let’s take a quick look at just a few of the realities documented by various organizations.

·         Annual price tag for bad loan data in the U.S. financial markets was as high as 7.3% of revenue – QAS Research, a unit of Experian,

·         FDIC reported that over 83% of the mortgages they audited contained violations,

·         Over 50% of the data corruption and integrity issues reside outside of technology – IBM, “Transforming Enterprise Information Integrity,” and

·         With over 90% of all records stored electronically (or scanned into electronic formats), the ability to maintain integrity over the life of the financial product (and for compliance) is material.

Moreover, with existing and pending legislation, additional concerns arise for the integrity of historical data stores and for future databases.  A small sample of these includes (depending upon your business and model):

·         Consumer protection agency and its proposed charter,

·         Rule 803(6), U.S. Federal Rules of Evidence (see Info Law Group, “Privacy, Security, and Intellectual Property Law,” January 29, 2010),

·         “Skin-in-the-game” implications of Congressional financial bills (i.e., consequences of “cradle to grave” data life-cycle demands for definition, discovery, and defense), and

·         Existing and proposed state sponsored “breach” legislation – and consequences.

To believe that the guarantee of data integrity has been “met” across financial markets that are redefining rules of operation and conduct is fraught with peril.

A Principle-Driven Data Integrity Approach

For experienced enterprise architects, the utilization of principle-driven approaches is familiar – and represents stability and consistency for ever-increasing technological options.  Made famous in the “IT Paradigm Shift” by Don Tapscott and Art Caston, the use of the PRI (i.e., principle, rationale, and implication) architectural framework has gained global acceptance especially with the deployment of specialized technologies, layered outsourcing arrangements, and application compartmentalization. 

With estimates currently ranging from 4% to 9% of an IT budget directly or indirectly being consumed by information discovery, due diligence, and defense, the life-cycle challenge of data integrity cannot be left to chance. 

Table 1 provides an illustrative example for IT and business leaders of the granular and interdependent PRI’s, which are needed for the next decade.  So before you purchase the next origination system, sign up with a servicer, or restart private securitization efforts, ask yourself, “How are we addressing these areas, at what cost, and with what exposure?”

Table 1 – Illustrative PRI for Data Integrity / Data Architectures

Principle

Rationale

Implication

To guarantee adherence to organizational values, data custody and authentication must be able to be verified and certified by an objective and qualified third-party.

·          With the rapid adoption of cloud computing solution sets, data routing and its transmission cannot be assumed to be tamper proof.

·          Sequencing of data segments, since they may come from multiple sources and technology platforms, must involve the full acceptance of the data record or demand a complete rollback of all elements.

·          More than ensuring date and time stamps.

·          Demands more than standard hashing algorithms.

·          All data transmitted across public networks and from layered applications will be subject to predetermined certification tests.

·          Best practices will be adopted / modified to ensure data integrity of the highest quality.

·          On-going / continuous monitoring.

The validity of data and its consistency of capture, storage, and usage must conform to all published enterprise standards and applicable in-country regulations (e.g., privacy, confidentiality, et al).

·          Enterprise data demands must recognize disparate requirements, laws, and compliance demands when crossing local, state, federal, and international borders.

·          Within pending regulations, new implications of handling, storage, and usage will be demanded, with the “old” practices requiring a different set of data integrity “release” parameters.

·          If data is shipped in a cloud, its route should not violate laws on its path to corporate systems (a current challenge for network data routing in a cloud).

·          Compartmentalization of data management elements, schemas, and logical representations (along with all the rules and validation edits), should be maintained and able to be reconstructed for the life of the data.

Unless explicitly exempted by data owner, all data, rules, and manipulation must be implemented utilizing “data isolation modules” (DIM’s).

·          Identify and ensure fail-safe controls.

·          Ability to reconstruct “versioning” of data evolution at a discrete or elemental level.

·          Minimize recurring expenses associated with development and application segmentation.

·          DIM’s will have been validated and certified by custodian of record.

·          Taxonomies of data must be developed and maintained.

·          Any exception must be documented and sanctioned by corporate officer.

·          Use of (n)XML standards for data interchange and portability.

·          Common data models.

Data must be classified by both durability and risk to uphold usage, archival, and retirement.

·          With changes in technology solutions (i.e., technology is the least stable part of an architectural solution), portability and availability of data must be guaranteed.

·          A comprehensive data life-cycle approach for ensuring data integrity goes beyond application and database systems.

·          Part of a robust master data management (MDM) approach.

·          Years after the original transaction, due diligence demands will require a comprehensive review of all touchpoints including the outside review of litigants.

·          Workflows must reflect data requirements.

·          Rules engines do not singularly satisfy data integrity demands.

Data, and the transactional streams used during their life-cycle, must adhere to / exceed all auditing, logging, and compliance requirements.

·          With litigation on the rise, the “states” of data, its manipulation, and its usage must be captured across all its forms / transition states.

·          To support processes and regulatory demands, the data underlying the organizational certifications must be managed according to generally accepted practices that have been proven trustworthy in a court of law (i.e., they should be accepted by corporate legal advisors and auditors).

·          Use of robust WORM solutions according to performance, regulatory, and cost constraints.

·          All DIM’s and associated API’s must be certified.

·          Off-the-shelf certified solutions that incorporate industry standards, will be given preference during system “ever-greening” (to reduce capital costs, custom tailoring, while ensuring repeatability across multiple locations).

·          Monitoring and benchmarking of adherence will be done every (xxx).

Data sourcing, sequencing, and trustworthiness (SST), must be identified and maintained over the life-cycle of the data in adherence to corporate data standards and practices.

·          A clear system of record must be defined and maintained to guarantee efficacy and integrity.

·          With the data sourcing chain now firmly linked across origination, servicing, and securitization, access to the original data source (and the guarantee of its integrity) will be a top 5 corporate responsibility (and potential expense) during the next decade.

·          The use of “ACID” properties should be part of the certification process.

·          Use of existing and evolving data standards will be favored over internal, one-off solutions (to reduce costs, risks, and ETL requirements).

·          Developing and delivering data intelligence (DI) as part of standard operating procedures.

·          Strong and enforceable (with consequences) data policies.

NOTE: Illustrative and abbreviated for presentation purposes

 

After reviewing this lengthy table, some will ask, “Where’s the technology?  Where are the explicit standards?  How can you have a data integrity architecture without solutions?” 

If we could add additional columns to the right, we would then add in the technology and discrete solution sets needed to deliver the principle-driven architecture that has been rationalized with its interdependencies.  So far, we’ve aligned organizational need, identified processes, and touched on the need for personnel and skills.
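To make the idea of an added technology column a little more concrete, here is a hedged sketch of one rationale from Table 1: a record assembled from multiple sources is either accepted in full or rolled back completely.  It uses Python's built-in sqlite3 module.  The segment table, the accept_record function, and the sample loan identifiers are hypothetical names introduced for illustration, not a prescribed solution.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per segment of a multi-source data record.
conn.execute(
    "CREATE TABLE segment ("
    " record_id TEXT NOT NULL,"
    " seq INTEGER NOT NULL,"
    " source TEXT NOT NULL,"
    " payload TEXT NOT NULL CHECK (length(payload) > 0),"
    " PRIMARY KEY (record_id, seq))"
)

def accept_record(record_id, segments):
    """Store every segment of a record, or none of them.

    `segments` is a list of (seq, source, payload) tuples.  Returns True if
    the whole record was committed, False if it was rolled back.
    """
    expected = list(range(1, len(segments) + 1))
    try:
        with conn:  # commits on success, rolls back on any exception
            if sorted(seg[0] for seg in segments) != expected:
                raise ValueError("segments missing or out of sequence")
            conn.executemany(
                "INSERT INTO segment VALUES (?, ?, ?, ?)",
                [(record_id, seq, src, payload) for seq, src, payload in segments],
            )
        return True
    except (ValueError, sqlite3.IntegrityError):
        return False

complete = accept_record(
    "loan-001",
    [(1, "origination", "application"), (2, "servicing", "payment history")],
)
# The second segment is empty, so the first segment must not survive either.
partial = accept_record(
    "loan-002",
    [(1, "origination", "application"), (2, "servicing", "")],
)
stored = conn.execute(
    "SELECT record_id, COUNT(*) FROM segment GROUP BY record_id"
).fetchall()
```

After both calls, only the two segments of the first record remain; the valid first segment of the second record is discarded along with the invalid one.  This is the atomicity property that the “ACID” item in Table 1 refers to.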

Perhaps the questions that make more sense for business personnel using this approach include, “What technologies are required to satisfy the business needs and operational requirements (inherent in the business and operational needs of the first three columns)?  What solutions best fit our ‘As-Is,’ ‘To-Be,’ and gap implementation programs of work for minimal capital costs and maximum return ensuring data integrity rigor / discipline?”

So yes, data technology and standards are very important – but only after the operating parameters have been articulated and approved.  They will vary for nearly every FMG driven by their management team, markets, and current challenges.  The interesting aspect of this proven approach is that once defined and maintained, it works in concert with numerous development or provisioning methods, applications needed, or outsourcer selected. 
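As one illustration of a technology choice that could follow from the auditing principle in Table 1 (capturing the “states” of data across its transitions), the sketch below chains each logged change to the previous one with a SHA-256 hash so that an after-the-fact edit becomes detectable.  The event fields and function names are assumptions made for this example.  A hash chain alone is not sufficient, because the latest hash would still need to be anchored somewhere trustworthy, such as WORM storage or a third-party custodian.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry

def entry_hash(prev_hash, event):
    """Hash the previous entry's hash together with a canonical form of the event."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()

def append(log, event):
    """Append an event, linking it to the entry before it."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": entry_hash(prev, event)})

def verify(log):
    """Return the index of the first inconsistent entry, or None if the chain is intact."""
    prev = GENESIS
    for i, entry in enumerate(log):
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["event"]):
            return i
        prev = entry["hash"]
    return None

# Hypothetical change history for a single loan record.
log = []
append(log, {"loan": "001", "field": "rate_pct", "old": None, "new": 5.25})
append(log, {"loan": "001", "field": "rate_pct", "old": 5.25, "new": 4.75})
append(log, {"loan": "001", "field": "status", "old": "open", "new": "sold"})

intact = verify(log)           # chain is consistent before any tampering
log[1]["event"]["new"] = 3.0   # simulate a silent edit to history
tampered_at = verify(log)      # index of the entry that no longer matches
```

Because every change is retained in order, the same log also allows the “versioning” of a data element to be replayed at a discrete level.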

In summary, the data integrity principles outlast the technologies, promote non-linear decision making, and “hold up” under the scrutiny of review.  For business leaders signing the checks of new solutions, it gives tremendous business case justification to “why IT does matter.”  And, when it comes to legal and regulatory challenges, preparation and anticipation are always cheaper than intrusive discovery and evidence gathering.

Challenges within Cloud Computing and Virtual Data Provisioning

Last, but not least, are the data integrity challenges materializing within cloud solution sets.  Since cloud computing and associated data storage options are among the fastest-growing offerings (e.g., Amazon, IBM, Dell), we cannot neglect the evolving issues of data integrity within the clouds.  (For an extensive discussion of cloud computing challenges and deployments, see The Alchemy of Creating Intelligence in “The Cloud”, by Mark Dangelo, October 2009). 

Data routing and storage within the cloud is one of the leading concerns facing publicly provisioned cloud computing environments.  As the data is routed via a host of third-party and country-controlled telecommunication services, it can be exposed to fraud, corruption, sequencing challenges, and of course, privacy constraints. 

Additionally, given the types of interfaces and security mechanisms deployed across disparate computing platforms, the opportunity to introduce error and fraud also increases when the flexibility of “pay-as-you-go” cloud computing is added.  Bottom line, there are exceptional benefits and values with cloud computing – but for today, organizations must consider exposure, risks, and consequences before placing mission critical or financial transactions over them.
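The corruption and sequencing exposures described above can be sketched with a keyed message authentication code (HMAC) and a sequence number on each transmitted segment.  This is a minimal illustration using Python's standard hmac and hashlib modules; the shared key, message layout, and function names are hypothetical, and key distribution and encryption for privacy are deliberately left out.

```python
import hashlib
import hmac

SHARED_KEY = b"demo-key-not-for-production"  # hypothetical key for illustration only

def seal(seq, payload):
    """Attach a sequence number and an HMAC-SHA256 tag covering both seq and payload."""
    tag = hmac.new(
        SHARED_KEY, seq.to_bytes(8, "big") + payload, hashlib.sha256
    ).hexdigest()
    return {"seq": seq, "payload": payload, "tag": tag}

def check_stream(messages):
    """Return a list of problems found; an empty list means the stream passed."""
    problems = []
    expected_seq = 1
    for msg in messages:
        recomputed = seal(msg["seq"], msg["payload"])["tag"]
        if not hmac.compare_digest(recomputed, msg["tag"]):
            problems.append(f"seq {msg['seq']}: authentication tag mismatch")
        if msg["seq"] != expected_seq:
            problems.append(f"expected seq {expected_seq}, got {msg['seq']}")
        expected_seq = msg["seq"] + 1
    return problems

sent = [seal(1, b"application"), seal(2, b"appraisal"), seal(3, b"closing")]

clean = check_stream(sent)

# A segment altered somewhere along the route fails its tag check.
altered = dict(sent[1], payload=b"appraisal (edited)")
tampered = check_stream([sent[0], altered, sent[2]])

# A segment lost along the route shows up as a sequence gap.
dropped = check_stream([sent[0], sent[2]])
```

Unlike a plain hash, the tag cannot be recomputed by an intermediary that lacks the key, which is one reading of the Table 1 remark that integrity across public networks demands more than standard hashing.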

The good news is that standards and practices are being developed.  These include multiple standards from the Object Management Group (OMG), the Storage Networking Industry Association (SNIA), the Organization for the Advancement of Structured Information Standards (OASIS), and many others (for a comprehensive list see http://cloud-standards.org/wiki/index.php?title=Main_Page).

 

* * * * * * * *

 

As you can see, data integrity is a complex discussion that incorporates many horizontal and vertical disciplines within FMGs and across the computer science, auditing, and legal professions.  The final question that has yet to be asked is “How do I know if my organization has a data integrity problem?”  Well, if this is the first time you have asked that question for your enterprise….

 

For additional information and discussion on data integrity, please attend the “How Much Is Data Integrity Costing You?” conference session on April 26, 2010, 3:00 p.m., at the MBA’s National Technology in Mortgage Banking Conference.  For a complete list of session panelists and topic discussions see www.mbaa.org.

 

Snapshot - A Survey of Cloud Computing Analytics and Usage

Wednesday, December 2nd, 2009

Taking the pulse of markets and their participants

By Mark P. Dangelo

www.Innovative-Relevance.com

 

As this decade draws to a close, there has been great talk in the media about the sesquicentennial anniversary of the publication of Darwin’s Origin of Species.  Some refer to the “animal spirits” that reside in the dealers of Wall Street, the industry moguls, and the activists who are trying to tame an uncooperative world.  However, just like Darwin’s projections and the science around evolution, a new “technical animal” called cloud computing is changing its genetic structure every day. 

One thing that is very different moving forward with the birth of cloud solutions is that CIOs and CTOs will be measured by business metrics – rather than overhead metrics of cost management and infrastructure spending. 

Additionally, there are two key trends that are rapidly expanding regarding the usage of cloud computing resources and on-going viability – services and “all-in-one” offerings. 

From the survey feedback, the use of services appears to be a key component and concern for many businesses and IT professionals.  Who to trust?  Are they knowledgeable?  What cost and on-going commitment is required? 

Regarding the “all-in-one” offerings, companies are impressed with the idea of a “one-stop-shop,” but are reluctant to embrace an all-or-nothing solution that appears on the surface to be expensive with considerable lock-in periods.  However, with an increasing number of vendors all providing hardware, software and services in an end-to-end bundle, the challenge for purchasers will be efficiently evaluating each on its merits against corporate needs.  Specifically, purchasers should buy only what is needed and not pay for unused or unnecessary options.

The survey was constructed to focus on seven distinct areas of interest:

·         Enterprise and Department Usage

·         Belief in Existing Analytics

·         Importance of Existing Data Sources

·         Importance of Existing Analytics

·         Cloud Computing Challenges

·         Cloud Computing Acceptance

·         Cloud Computing Preparedness

Enterprise and Department Usage

Survey results can often confirm what you have expected or, on some occasions, produce insights that shed light on emerging trends or organizational beliefs.  This on-going survey was no exception.

When asked if quantitative measurements were important to the enterprise, nearly 60%[1] of the respondents said they were high to critical, yet not quite 50% said they were effective.

Conversely, only 21% of the respondents, when asked the same questions about their departments or divisions, said that quantitative measurements were effective, but more than twice as many (44%) said that these same ineffective measurements were of high to critical importance. 

The implications of these results suggest that internal process measurements were not meeting the needs of the local departments / divisions, even though the demand for measurements was moderately high.  Moreover, these same individuals surveyed believed that the enterprise had more effective analytics and that they were almost 150% more effective than their own.
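The comparison between enterprise and departmental effectiveness can be reproduced from the rounded figures quoted above.  The short calculation below is only a sketch: it assumes 50% for the enterprise figure (the text says “not quite 50%”) and 21% for departments, so the result will differ slightly from one computed on the unrounded survey data.

```python
# Rounded figures taken from the text; the underlying unrounded data is not available.
enterprise_effective = 50   # percent of respondents rating enterprise measurements effective
department_effective = 21   # percent rating departmental measurements effective

# Absolute difference, in percentage points.
point_gap = enterprise_effective - department_effective

# Relative difference: how much more effective the enterprise is rated, in percent.
relative_gap_pct = (enterprise_effective / department_effective - 1) * 100

print(f"Gap: {point_gap} percentage points")
print(f"Relative difference: about {relative_gap_pct:.0f}% more effective")
```

On these rounded inputs the relative difference is in the neighborhood of the “almost 150%” cited, while the absolute gap is 29 percentage points.  The two measures answer different questions and should not be used interchangeably.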

Belief in Existing Analytics

While the respondents firmly indicated that the organization as a whole was better off than their departments or divisions, their belief in the value of their analytical approaches was strong (see Figure 1). 

A deficiency identified with the existing analytics was their ability to provide predictive intelligence – only 14% thought that what they were doing was of high or critical importance. 

The only other potential challenge was the use of analytics to support the delivery of strategic goals or the achievement of operational strategies – 30% identified these as low or NS (not significant). 

Importance of Existing Data Sources

The importance of existing data within the organization for the most part was what analytical specialists would expect.  First, the use of spreadsheets remained a valuable source of analytical intelligence (see Figure 2).  Moreover, point based application systems continued to be the master source for many data analysis and synthesis operations to support extraction of information into the spreadsheets.

This series of questions clearly points to potential conflicts with the use of information and the subsequent manipulation of information by desktop toolsets (and the security, logic, and integrity within them). 

The surprise factor was the 86% moderate to critical importance placed on non-internal or third party data sources for analytical decisions.  Clearly, information integration, archiving, and transformation have become a primary need within business and IT departments.

Importance of Existing Analytics

Whereas current analytics and data sources were given high marks, their importance for various decision-making or operational performance purposes varied (see Figure 3). 

For example, 77% of respondents clearly indicated that analytics for on-going improvements or quality of delivery were of moderate to critical importance.  Yet, only 71% said that the existing data and sources were important for risk analysis and/or mitigation. 

Puzzling was that only 37% identified analytics as important for revenue or profit improvements, given that margins are always measured.  This suggests a disjointed view and potential misuse of analytics across the enterprise: while the departments and divisions focus on exposure and improvements, they fail to see the potential direct correlation to organizational profits.  More striking still was the polarized view of analytics for regulatory compliance.  A strong majority (68%) identified analytics as important for regulatory compliance and just 6% assigned them moderate importance, yet a high percentage (25%) indicated that analytics were low or non-significant for meeting regulatory demands. 

Cloud Computing Challenges

While the source and uses of existing analytics yielded a few surprises from the expectations, the introduction of cloud computing and the data sources it generates created some clear challenges (see Figure 4). 

The biggest surprise was the indication by both business and IT professionals that the introduction of cloud computing materially changes the future role of IT – nearly 78%. 

Equally insightful was the 80% of respondents who said the usage of cloud computing increased the risks in meeting regulatory needs and agency guidance.

As anticipated, respondents expected data integration challenges with cloud computing, with 29% indicating high to critical issues. 

What was expected, but also telling, was the 42% who said they expected high to critical security issues.  However, equally telling was the 29% who said security challenges within cloud computing were low or non-significant. 

Cloud Computing Acceptance

While the respondents were concerned with the use of cloud computing and meeting regulatory compliance, 50% also felt that it was high to critical in meeting oversight and governance needs (see Figure 5). 

Moreover, 72% believed that cloud computing would be of moderate to critical significance to meet changing consumer and business functionality in the timeframes demanded by the markets.  The respondents also stated that ROI of cloud computing was a major factor in its adoption, but 56% indicated that cloud computing was non-significant or of moderate importance for consumers or customers.

Cloud Computing Preparedness

Finally, the most foreboding measurements regarding cloud computing arrived in the area of organizational preparedness (see Figure 6). 

In every category the ability to perform and deliver on the promises and requirements of cloud computing garnered very substantial non-significant or low ratings.  Many times, this single category gained 50% of the responses.

Regarding the ability to address security challenges, only 17% said that their organization rated high to critical capabilities.

The skills demanded for data integration across the layer of cloud applications received only 24% in the high to critical range.  This alone signified a clear challenge and opportunity surrounding skills, standardization, outsourcing, and correlation of growing data sources provisioned outside the traditional intranets.  

Yet, while there were concerns surrounding data integration abilities, the rating for the use and deployment of analytics using cloud computing data sources was three percentage points higher, at 27%.  This margin is not significant, but it may point to a greater belief that once the data is properly integrated, the ability to summarize, augment, and transform raw fields will be easier for analytical personnel. 

Finally, when asked a non-specific question on the general cloud computing skill sets internally available, 28% of the respondents believed that their organizations had the necessary high or critical abilities to effectively implement cloud computing – its data, analytics, and security.

Taken separately, each cloud computing skill category scored lower than the aggregate. 

In Summary

The snapshot of this survey clearly points to a belief that internal analytics apart from cloud computing are established and reasonably trusted.   However, there were clear areas of opportunity regarding their usage and robustness.

Additionally, when cloud computing principles and challenges were introduced, there was a material reduction in the comfort level associated with this rapidly evolving set of integrated technologies.  The most important concerns clearly pointed to data integration and security protection. 

[1] Note, for simplicity of presenting the survey findings in this forum, all numbers were rounded to the nearest integer.