Big Data Needs Big Content

February 24, 2014 by Jennifer Cobb

A recent report from AIIM and IBM on measuring the ROI of Big Data and content analytics indicates that many organizations' content management practices remain too immature to include critical unstructured and free-form text data in their big data projects.

Why? 65% of organizations have “disorganized content.” At the same time, 62% of organizations say that they would find content analytics to be “very valuable.” The biggest business value would come from improving data quality, checking policy compliance and speeding up customer service. These are not necessarily big data issues, but simply the kinds of operational capacities that are high on the wish list for many organizations – capacities that rely on capturing semi-structured content.

The findings in this report underscore what we at Captricity hear all the time – data capture from paper documents remains difficult, expensive and error-prone.   Running sophisticated analytics on “big content” remains out of reach for most organizations.  When critical data is missing from analytics and monitoring systems, things get missed.  As the world races toward a more sophisticated, data-rich environment, those missing elements will be a liability.

The following chart offers a good sense of what is missing from analytics systems.  The green lines, which dominate most content types, remain aspirational for many organizations.

Chart 8 AIIM Big Data

If organizations could get easy access to this content, what would they do with it?  The following chart offers some clues.  As you can see, many organizations would like to include the content in data sets for querying, running analytics and improving their governance and management practices.

Chart 9 AIIM Big Data

Several hurdles remain before organizations can bring big content into their big data projects and ongoing operational support.


  • Data quality – Most organizations continue to rely on data capture solutions whose output is not operationally ready. Organizations spend many additional person-hours on data QA, a cost that quickly becomes prohibitive when dealing with large content streams.
  • Privacy and security – For many organizations, this was a show-stopper. The ability to protect personally identifiable information (PII), financial information, and legal and medical records was paramount.
  • Capture for handwritten documents – Many organizations have high-value information contained in handwritten documents, including incident reports, claims, and comment fields in feedback forms. This content is considered “an intractable issue” for many automated systems.


At Captricity, we address all of these concerns. Our 100% HIPAA-compliant solution is helping hundreds of organizations to unlock high-quality data, securely, from paper forms. We have cleared backlogs of reports and of compliance and regulatory forms, and we have managed ongoing workflows of critical lead forms, customer support information and much more. We achieve 99%+ quality on handwriting and human marks of all types.

To learn more about what we do, click here.  Or, better yet, sign up for a free trial today!