Schema on Read

Definition - What does Schema on Read mean?

Schema on read is a data-handling strategy used by newer tools such as Hadoop and other more recent data storage technologies. In schema on read, a plan or schema is applied to data as it is pulled out of a stored location, rather than as it goes in.

Techopedia explains Schema on Read

Older database technologies enforced schema on write: data had to fit a plan or schema as it went into the database. This was done partly to enforce consistency of data, which is one of the major benefits of schema on write. With schema on read, the people handling the data may need to do more work to identify each piece of data, but they gain much more versatility.
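The difference between the two strategies can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `SCHEMA` mapping, the field names and the three functions are hypothetical, and plain Python lists stand in for a database or a Hadoop file store.

```python
import json

# Hypothetical schema: field name -> function used to coerce the stored value.
SCHEMA = {"user_id": int, "amount": float}


def write_with_schema(store, record):
    """Schema on write: coerce the record to the schema before storing it.

    A record with a missing or unconvertible field raises an error and is
    never stored.
    """
    row = {name: cast(record[name]) for name, cast in SCHEMA.items()}
    store.append(row)


def write_raw(store, line):
    """Schema on read, write side: store the raw text exactly as it arrived."""
    store.append(line)


def read_with_schema(store):
    """Schema on read, read side: apply the schema as data is pulled out."""
    for line in store:
        record = json.loads(line)
        yield {name: cast(record[name]) for name, cast in SCHEMA.items()}
```

In this sketch the raw store accepts any text at all, and the work of identifying each field is deferred to `read_with_schema`, which is where the extra effort described above shows up.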

The schema-on-read design fits the main uses of Hadoop and related tools. Companies want to aggregate large amounts of data and store it for later use, and they may value collecting unclean or inconsistent data more than they value strict enforcement of a data format. Hadoop can take in a wide range of small pieces of data that are not completely organized, and that information gets organized as it is used. Under the older schema-on-write approach, the less organized data would probably be thrown out.
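A small Python sketch can make the trade-off concrete. The sample events, the alternate field names (`user`, `amt`) and both functions are invented for illustration; they are not part of Hadoop or any particular tool.

```python
# Hypothetical incoming events with inconsistent shapes.
RAW_EVENTS = [
    {"user_id": 1, "amount": 10.0},
    {"user": "2", "amt": "5.5"},   # different field names, string values
    {"user_id": 3},                # amount is missing
]


def strict_ingest(events):
    """Schema-on-write style: keep only records that already match exactly."""
    kept = []
    for event in events:
        has_user = isinstance(event.get("user_id"), int)
        has_amount = isinstance(event.get("amount"), float)
        if has_user and has_amount:
            kept.append(event)
    return kept


def organize_on_read(events):
    """Schema-on-read style: every record was stored; tidy it up when used."""
    for event in events:
        user = event.get("user_id", event.get("user"))
        amount = event.get("amount", event.get("amt"))
        yield {
            "user_id": int(user) if user is not None else None,
            "amount": float(amount) if amount is not None else None,
        }
```

With these sample events, the strict path keeps one record out of three, while the read-time path returns all three, with gaps marked as `None` rather than discarded.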

Put another way, schema on write is better for producing very clean and consistent data sets, but those data sets may be more limited. Schema on read casts a wider net and allows for more versatile organization of data. Experts also point out that it is easier to create two different views of the same data with schema on read.
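The point about two views of the same data can be illustrated with a short Python sketch. The sample log lines, their comma-separated layout and the two view functions are assumptions made up for this example.

```python
# Hypothetical raw log lines, stored once with no schema attached.
# Assumed layout: date,user,action,value
RAW_LINES = [
    "2024-01-05,alice,login,0",
    "2024-01-05,alice,purchase,30",
    "2024-01-06,bob,purchase,12",
]


def purchases_view(lines):
    """View 1: read the lines as purchase records with a numeric amount."""
    for line in lines:
        _date, user, action, value = line.split(",")
        if action == "purchase":
            yield {"user": user, "amount": float(value)}


def activity_view(lines):
    """View 2: read the same lines as a count of events per day."""
    counts = {}
    for line in lines:
        date, _user, _action, _value = line.split(",")
        counts[date] = counts.get(date, 0) + 1
    return counts
```

Neither view changes the stored lines. Each one applies its own schema at read time, so a third view could be added later without reloading any data.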

The schema-on-read strategy is one important reason Hadoop and related technologies are so popular in enterprise technology today. Businesses use large amounts of raw data to power many kinds of business processes, applying fuzzy logic and other sorting and filtering systems to corporate data warehouses and other large data assets.
