Digitalization: innovating data management

by Michele Dallorto
Apr. 21, 2020
6 minutes

Document digitization is of paramount importance for public administrations and businesses alike. Technological developments have created new business opportunities, but business operators and public officers have often been unable to keep pace with this evolution. The result is a mismatch: a modern technological infrastructure serves societies and businesses that are still largely paper-based and are struggling with the digital shift in data management. Moreover, the majority of legally binding documents were created before the advent of the digital age.

Digitizing documents optimizes process management and data analysis, making it possible to improve data quality and extract more value from data. WizKey firmly believes in applying new technologies to data management: it is a key factor in the success of any business because data quality largely affects profitability, especially in structured finance.

New technologies offering new possibilities

The digitization of documents is the starting point for new solutions made available by innovative technologies. Electronic archiving, immediate access to physical documents and the ability to query their data have greatly advanced and are now widely used by businesses, research centers and public administrations. Combining several technologies in data management gives organizations a better capability to extract and process data from physical documents. The technologies applied to extracting and adding value to the data contained in paper documents therefore deserve attention.

Technologies such as Optical Character Recognition (OCR) and Artificial Intelligence (AI) have evolved significantly, and today’s solutions cover a wide range of price and quality requirements. Their application in the financial sector has considerably improved the quality and value of data, yielding better results when classifying documents. This in turn allows higher pricing for assets that still rely on physical legal documentation. Automatically capturing the meaning of texts, images and tables from unorganized documents opens up valuable possibilities: firstly, tasks that previously required manual labor can now be automated; secondly, new possibilities for analytics are created at a high level of detail thanks to smart linking with the business’s own data processing model or other processing software.

The process of extracting organized data from scans of physical documents can be broken down into three steps:

  1. From paper copy to digital scan. Documents inevitably need to be scanned to create a digital file. Scanned documents can be shared and viewed almost immediately thanks to the internet. However, these files are still of limited use because the data they contain is available as images only.
  2. From image file to intelligible PDF. The image file is transformed into an intelligible PDF (Portable Document Format) through OCR. In this automated process, letters and numbers are recognized as such, enabling a keyword search within the document. However, the manual task of “copying and pasting” individual account numbers or other specific data remains. The relevant information is still stuck in the documents and cannot yet be retrieved automatically.
  3. From intelligible PDF to smart document. The decisive factor for automated recognition of unstructured data is whether the machine can interpret information within its context. To enable automated recognition, and thus turn an intelligible PDF into a smart document, we need to implement interaction between humans and machines on two different levels.

Empowering humans: In the first step, the machine prepares the document so that the previously flat representation becomes hierarchically structured. The textual part of the document is broken down into individual letters and numbers (so-called glyphs), enriched with information on position, color, font, etc., and then reassembled into words, sentences, and entire text sections. Other content, such as images, spreadsheets, or vector graphics, undergoes similar structuring processes. The document is thereby transformed into a structured data tree that the machine must now learn to understand.
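As a rough illustration of this structuring step, the sketch below reassembles positioned glyphs into lines and words and returns a small nested tree. It is only a minimal sketch under stated assumptions: the `Glyph` class, the tolerance values and the shape of the tree are hypothetical names chosen for this example and are not taken from WizKey Define.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """A single recognized character with its position (hypothetical model)."""
    char: str
    x: float      # left edge of the glyph
    y: float      # baseline of the glyph
    width: float  # horizontal extent of the glyph


def group_into_lines(glyphs, y_tolerance=2.0):
    """Group glyphs whose baselines lie within y_tolerance of a line's first glyph."""
    lines = []
    for glyph in sorted(glyphs, key=lambda g: (g.y, g.x)):
        if lines and abs(lines[-1][0].y - glyph.y) <= y_tolerance:
            lines[-1].append(glyph)
        else:
            lines.append([glyph])
    # Within each line, read the glyphs from left to right.
    return [sorted(line, key=lambda g: g.x) for line in lines]


def line_to_words(line, gap_factor=0.5):
    """Split one non-empty line into words wherever the horizontal gap is large."""
    words, current = [], [line[0]]
    for prev, glyph in zip(line, line[1:]):
        gap = glyph.x - (prev.x + prev.width)
        if gap > gap_factor * prev.width:
            words.append(current)
            current = [glyph]
        else:
            current.append(glyph)
    words.append(current)
    return ["".join(g.char for g in word) for word in words]


def build_tree(glyphs):
    """Turn a flat list of glyphs into a nested lines -> words structure."""
    return {"lines": [{"words": line_to_words(line)}
                      for line in group_into_lines(glyphs)]}
```

Calling `build_tree` on the glyphs of a scanned page would yield a dictionary of lines, each holding its words in reading order. A production system would also carry font, color and layout information on every node, as described above.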

Empowering machines: In the second step, the user gives meaning to the elements by telling the machine what information is relevant and how to retrieve it. This is done either with traditional rule-based approaches, such as keywords or the position in the document, or with modern machine learning algorithms that enable the machine to understand semantic content in a cognitive way.
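The rule-based approach can be sketched as a set of keyword-anchored patterns applied to the recognized text. The field names, keywords and regular expressions below are illustrative assumptions for this example only; real documents would need rules tuned to their own wording and layout.

```python
import re

# Hypothetical extraction rules: each field is located by a keyword
# followed by the value to capture.
RULES = {
    "account_number": re.compile(
        r"Account\s+(?:No\.?|Number)\s*:?\s*([A-Z0-9]{6,34})", re.IGNORECASE
    ),
    "amount": re.compile(
        r"(?:Outstanding|Principal)\s+Amount\s*:?\s*(?:EUR|€)?\s*([\d.,]+)",
        re.IGNORECASE,
    ),
}


def extract_fields(text, rules=RULES):
    """Return the first match of every rule found in the OCR text."""
    found = {}
    for name, pattern in rules.items():
        match = pattern.search(text)
        if match:
            found[name] = match.group(1)
    return found
```

Rules of this kind are transparent and easy to audit, but they break when the wording of a document changes, which is where the machine learning alternative mentioned above becomes attractive.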

Data management with WizKey

WizKey offers a data management solution that integrates all the technologies mentioned above. The product is specifically designed to tackle the specific issues of our customers. We collaborate closely with our clients and understand the problems they face. The management of paper documents is one of them, and it affects the value of assets whose legal ownership status is represented by qualifying paper documents. When considering NPLs (Non-Performing Loans), for instance, it is quite common to deal with piles of paper documents that form the legal backbone of the loans themselves. The presence of substantial quantities of paperwork makes due diligence expensive and time-consuming, especially because the procedure recurs every time a portfolio is transferred or sold: both parties need to ensure the integrity of the paper documentation. Documents are in fact lost frequently, and when this occurs the result is a discrepancy between the due diligence of the document set performed by the originator and that performed by the buyer. Missing documents can give the buyer valuable grounds to request considerable discounts on the portfolio during the negotiation process.

WizKey Define is a platform specifically designed for structured finance. It works on multiple layers, collecting inputs from multiple sources. This synergy between the components of the platform creates a fully integrated solution that makes the digital format of each document immediately available and provides a double-check scheme to verify data correctness. The PDF scans are onboarded, and the data is filled into the appropriate tabs of the platform, through an automated upload from the data tape. At that point the OCR system comes into action and compares the data extracted from the digitized documents with the data onboarded on the platform. Whenever differences are detected, the platform informs the user, who can proceed with the correction through the software’s intuitive user interface.
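The double-check idea can be illustrated with a small comparison between a record from the data tape and the values extracted from the corresponding document. This is only a sketch of the general technique, not WizKey Define's implementation: the function names, the field names and the normalization rule (ignoring whitespace and letter case) are assumptions made for the example.

```python
def normalize(value):
    """Remove all whitespace and ignore case so formatting differences do not count."""
    return "".join(str(value).split()).upper()


def find_discrepancies(tape_record, ocr_record):
    """Compare data-tape values with OCR-extracted values, field by field.

    Returns a list of (field, tape_value, ocr_value) tuples; ocr_value is
    None when the field could not be found in the document at all.
    """
    issues = []
    for field, tape_value in tape_record.items():
        ocr_value = ocr_record.get(field)
        if ocr_value is None:
            issues.append((field, tape_value, None))
        elif normalize(tape_value) != normalize(ocr_value):
            issues.append((field, tape_value, ocr_value))
    return issues
```

Each returned tuple corresponds to one notification an operator would review, either correcting the onboarded value or confirming that the OCR reading was wrong.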

A valuable add-on of WizKey Define is the legal recognition and validation of uploaded documents. WizKey has an exclusive partnership with a pool of notaries with expertise in blockchain technology. This ensures that the qualifying PDFs of an asset are legally binding in digital format as well. Every time a user uploads a document to the platform, the uploaded PDF copy is validated, ensuring its originality and legal validity.

The platform WizKey Define

All data management operations, such as portfolio creation, document onboarding and the search for discrepancies, are easily performed by the operator through our platform WizKey Define, whose user interface makes each operation clear and highly intuitive.

Screenshot of the WizKey platform login page

The operator intuitively creates portfolios and loads the associated digitized files. This operation is carried out through automated data tape onboarding. Alternatively, drag-and-drop upload of previously digitized documents is available as well.

Screenshot of the WizKey platform new portfolio page

Once the credits have been successfully uploaded to the platform, along with the related digitized PDFs, the OCR system double-checks them, searching for discrepancies between the data already extracted and available in the platform and the actual data present on the original paper documents. The operator receives a notification from the OCR system whenever errors are detected and can then easily correct them by browsing the tabs in the platform and adjusting the specific terms.

Screenshot of the WizKey platform documents digitalization page


New possibilities opened up by innovative technologies have a positive impact on process management, business performance and research & development, as well as on data quality and data valorization. The optimization offered by these tools is further amplified by the ability to use them through remotely operated software, which ensures flexibility and ease of use for operators. Moreover, as businesses move quickly to adapt to remote working conditions, our product fits in nimbly as a solution for clients that increasingly rely on digital data management to collect, review and share business-critical documents with board members, council, advisors, and audit teams. The data management model offered by WizKey Define meets the highest standards for intelligibility, user experience and interaction with OCR and AI, proving itself the solution of choice for operators willing to optimize and innovate their data management.

