MDP Based Approach for Data Cleansing and Importing at Runtime
Keywords:
Data Cleansing, Data Importing, Business Intelligence, Elixir framework, MDP, Enterprise Applications, cost control, declarative languages, intelligent language, polymorphic table, cascade delete, balance design pattern
Abstract
This article presents a new methodology that uses new models based on MDP (Model Driven Programming) to build a user-friendly and efficient ETL/data cleansing framework that operates in a live environment. It enables business experts to import and clean data themselves, since it requires no background in programming or database management.
Data importing and cleansing is the most sensitive and expensive step in enterprise application development (e.g., e-government systems, ERPs, forex, banking, HMIS), and when it fails, the entire project fails.
Our methodology detects errors early, before any data is actually imported. It checks not only integrity errors but also semantic errors, as if the data had already been imported into the live system.
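One way to picture such a pre-import check is a dry run inside a database transaction that is always rolled back. The sketch below is only an illustration under stated assumptions and not the Elixir framework's mechanism: the tables `department` and `patient_visit`, the rule that a cost may not be negative, and the functions `make_db` and `dry_run` are all hypothetical, and SQLite stands in for the live database.

```python
import sqlite3


def make_db():
    """Create a tiny in-memory schema (hypothetical, for illustration only)."""
    # isolation_level=None lets us issue BEGIN/ROLLBACK explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE patient_visit ("
        " id INTEGER PRIMARY KEY,"
        " department_id INTEGER NOT NULL REFERENCES department(id),"
        " cost REAL NOT NULL)"
    )
    conn.execute("INSERT INTO department VALUES (1, 'Ophthalmology')")
    return conn


def dry_run(conn, rows):
    """Attempt every row, collect (row_number, message) errors, keep nothing."""
    errors = []
    conn.execute("BEGIN")
    try:
        for n, (visit_id, dept_id, cost) in enumerate(rows, start=1):
            # Semantic rule (assumed): a visit cannot have a negative cost.
            if cost < 0:
                errors.append((n, "semantic: negative cost"))
                continue
            try:
                conn.execute(
                    "INSERT INTO patient_visit VALUES (?, ?, ?)",
                    (visit_id, dept_id, cost),
                )
            except sqlite3.IntegrityError as exc:
                # Integrity rule: duplicate key or missing parent row.
                errors.append((n, f"integrity: {exc}"))
    finally:
        # Always undo the trial inserts; the database is left unchanged.
        conn.execute("ROLLBACK")
    return errors
```

Calling `dry_run(make_db(), rows)` returns the list of offending row numbers with a reason for each, so a business expert could correct the source file before any real import takes place.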
Moreover, our methodology allows the imported data to be rolled back at any time without restoring the whole database (i.e., users keep every record created after the import, and only the data related to the import is removed). This makes our model suitable for importing data into live systems.
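A selective rollback of this kind can be sketched by tagging every imported row with a batch identifier and later deleting only the rows carrying that tag. This is a minimal illustration and not necessarily how the framework does it: the `item` table, the `import_batch` column, and the functions `create_schema`, `import_rows`, and `rollback_import` are hypothetical names, and SQLite stands in for the live database.

```python
import sqlite3


def create_schema(conn):
    """Hypothetical table; import_batch is NULL for rows users enter by hand."""
    conn.execute(
        "CREATE TABLE item ("
        " id INTEGER PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " import_batch TEXT)"
    )


def import_rows(conn, batch_id, names):
    """Insert the rows of one import, each tagged with its batch id."""
    with conn:  # commits on success, rolls back if an exception is raised
        conn.executemany(
            "INSERT INTO item (name, import_batch) VALUES (?, ?)",
            [(name, batch_id) for name in names],
        )


def rollback_import(conn, batch_id):
    """Delete only the rows of the given batch; return how many were removed."""
    with conn:
        cur = conn.execute(
            "DELETE FROM item WHERE import_batch = ?", (batch_id,)
        )
    return cur.rowcount
```

Rows entered by users after the import carry no batch tag, so `rollback_import` leaves them untouched. In a schema with related tables, rows that depend on the imported ones would also need to be removed, for example through cascading deletes.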
We have implemented our approach using an MDP framework (Elixir), which is based on free/open source infrastructure including J2EE, JBoss, and MySQL.
The final system was tested and validated within a real “cost control” project applied to medical services in the Ministry of Health (MoH) in Syria. The first test was carried out in the largest hospital belonging to the MoH (Mujtahed Hospital), then in the Eye Hospital, after which the MoH team continued the work on its own in all other hospitals.