Data Liberation: Free Your Data From Vendor Lock-in & Gain Insights

@mrpaulandrew
Apr 15, 2024

Let us help you break free from proprietary data formats and access valuable insights with open formats and standards, using cloud-agnostic tools.

Hi, I'm Paul and I write blogs to help process the thoughts in my head... :-)

AKA, the musings of a slightly grumpy, battle-hardened data engineer, technology strategist and enterprise architect.

Context

What is data liberation and why does it matter? The textbook answer: data liberation is the process of converting your data from vendor-specific formats to open and interoperable standards. The goal is to allow you to access, analyse and share your data with any tool or platform, without being restricted by the limitations or costs of proprietary database software. Based on this, getting the AI image generator to create something showing a 'database in chains' seemed to work well, with the additional prompt of rusty chains, seen on the right. Nice!

For us, this is important because it gives you (the business) more control over, flexibility with and value from your data. You can use the best tools for your needs, whether they are open source, commercial or otherwise. Ultimately, you avoid vendor lock-in, which can limit your options, increase your running costs and reduce your innovation. As a side benefit, collaboration with others becomes much easier, meeting people on their terms, with their preferred tools.

Challenges

Without naming names just yet, data liberation from a software vendor is not always easy or straightforward. We’ve seen lots of issues over the years, such as:

Lack of documentation or support from the vendor on how to export or convert your data.
Complexity or inconsistency of the data format, which may require custom scripts or tools to transform.
Loss of functionality or quality of the data, such as metadata, formatting, or calculations, during the conversion process.
Security or compliance risks, such as exposing sensitive or confidential data to unauthorized access or breach.

For most, these challenges can derail or delay projects or, in the worst cases, stop them altogether.
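
To make the metadata point above concrete, here is a minimal sketch of an export that keeps the schema alongside the data. It is an illustration under stated assumptions, not a description of any particular vendor's export path: SQLite stands in for the proprietary database, CSV and JSON stand in for the open formats, and the function name export_table and the file naming are hypothetical.

```python
import csv
import json
import sqlite3
from pathlib import Path


def export_table(conn, table, out_dir):
    """Write one table to CSV plus a JSON sidecar describing its columns.

    Assumption: `table` is a trusted name, since it is interpolated into SQL.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
    columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    schema = [
        {"name": c[1], "type": c[2], "nullable": not c[3], "primary_key": bool(c[5])}
        for c in columns
    ]

    # Keep the metadata next to the data so it is not lost in the conversion.
    (out_dir / f"{table}.schema.json").write_text(json.dumps(schema, indent=2))

    cursor = conn.execute(f'SELECT * FROM "{table}"')
    with open(out_dir / f"{table}.csv", "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow([col["name"] for col in schema])  # header row
        writer.writerows(cursor)                          # data rows
    return schema
```

Calling export_table(conn, "orders", "export") on an open sqlite3 connection would produce export/orders.csv and export/orders.schema.json. CSV alone drops column types and constraints, which is why the sidecar file is written; a typed format such as Parquet would carry more of that information in the data file itself.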

How

Watch this space, we have something cooking in the Cloud Formations kitchen! One of our key focuses in the industry for the coming year is to support customers with this challenge. That said, simple data migration can be boring, so we are taking a slightly different, AI-driven approach, covering not just the data, but the metadata and your legacy, locked-in code as well. Data liberation, you could say, is just the beginning.

Metadata liberation.
Code liberation.
Requirements liberation.
Value liberation.

... now we are getting to an acceptable point in our technology strategy.

In all cases, as a starting point in the stack, our target open-standard technologies for building data products will be Apache Spark compute (probably implemented using Databricks) and Data Lake storage, structured using Delta Lake.
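
As a rough sketch of what landing a source table in that target stack could look like, the snippet below reads a table over JDBC with Spark and writes it out as a Delta table. This is not the Cloud Formations tooling, only an illustration. The function names, the folder layout and the connection details are hypothetical, and it assumes a Spark session with Delta Lake available (the default on Databricks) and a JDBC driver for the source database installed on the cluster.

```python
def delta_path(lake_root, table):
    """Map a source table name such as 'dbo.Orders' to a folder under the lake root.

    The lower-cased, one-folder-per-schema layout is an assumption for this sketch.
    """
    return lake_root.rstrip("/") + "/" + table.lower().replace(".", "/")


def liberate_table(spark, jdbc_url, table, lake_root, user, password):
    """Copy one source table into a Delta table and return the target path."""
    # Read the source table through Spark's built-in JDBC data source.
    source = (
        spark.read.format("jdbc")
        .option("url", jdbc_url)
        .option("dbtable", table)
        .option("user", user)
        .option("password", password)
        .load()
    )

    # Write the rows to Data Lake storage in the open Delta Lake format.
    target = delta_path(lake_root, table)
    source.write.format("delta").mode("overwrite").save(target)
    return target
```

On a cluster, this would be called with an existing session, for example liberate_table(spark, jdbc_url, "dbo.Orders", lake_root, user, password), with credentials supplied from a secret store rather than hard-coded. A full overwrite is the simplest choice for a first load; incremental loads would need a different write strategy.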

We have the expertise and experience, and we are now building the tools to help you free your data from vendor lock-in and move it to these open standards.

Many thanks for reading.