Schema on Read

What Does Schema on Read Mean?

Schema on read is a data analysis strategy used in data-handling tools such as Hadoop and similar database technologies. In schema on read, a plan or schema is applied to the data as it is pulled out of a stored location, rather than as it goes in.

Techopedia Explains Schema on Read

Older database technologies enforced schema on write: the data had to fit a plan or schema at the moment it went into the database. This was done partly to enforce consistency of data, which is one of the major benefits of schema on write. With schema on read, the people handling the data may need to do more work to identify each piece of data, but they gain a lot more versatility.
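To make the contrast concrete, here is a minimal Python sketch, not taken from Hadoop or any real database. The `SCHEMA` fields and the two store classes are hypothetical names invented for illustration. One store checks records against the schema when they are written; the other keeps raw text and applies whatever schema the reader supplies.

```python
import json

# Hypothetical schema used only for illustration: field name -> Python type.
SCHEMA = {"user_id": int, "amount": float}


def apply_schema(record, schema):
    """Coerce a dict to the schema; raise ValueError if it cannot conform."""
    try:
        return {field: cast(record[field]) for field, cast in schema.items()}
    except (KeyError, TypeError, ValueError) as exc:
        raise ValueError(f"record does not fit schema: {record!r}") from exc


class SchemaOnWriteStore:
    """Validates on the way in, so only conforming rows are ever stored."""

    def __init__(self, schema):
        self.schema = schema
        self.rows = []

    def write(self, record):
        # Raises ValueError before anything is stored if the record does not fit.
        self.rows.append(apply_schema(record, self.schema))

    def read(self):
        return list(self.rows)


class SchemaOnReadStore:
    """Stores raw text untouched; the schema is supplied by the reader."""

    def __init__(self):
        self.raw_lines = []

    def write(self, line):
        self.raw_lines.append(line)  # no validation at write time

    def read(self, schema):
        # The schema is applied only now, as the data is pulled out.
        return [apply_schema(json.loads(line), schema) for line in self.raw_lines]
```

In this sketch, writing `{"user_id": 8}` to the schema-on-write store fails because `amount` is missing, while the schema-on-read store accepts the same record as a raw line and can still return it to a reader that asks only for `user_id`.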

In a fundamental way, the schema-on-read design complements the major uses of Hadoop and related tools. Companies want to aggregate large amounts of data and store it for particular uses. That said, they may value collecting unclean or inconsistent data more than they value strict schema enforcement. In other words, Hadoop can take in a wide range of small pieces of data that may not be completely organized. Then, as that information is used, it gets organized. Applying the older schema-on-write approach would mean that the less organized data would probably be thrown out.
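The idea that messy data is kept and only organized when it is used can be sketched as follows. This is an illustrative assumption, not Hadoop code: `RAW_LINES`, the `sensor` and `temp_c` fields, and `read_temperatures` are all hypothetical. Lines that do not fit the reader's schema are set aside, not deleted, so a later reader with a different schema can still use them.

```python
import json

# Hypothetical raw landing area: lines are kept exactly as they arrived.
RAW_LINES = [
    '{"sensor": "a1", "temp_c": "21.5"}',
    '{"sensor": "b2", "temp": 70}',   # different field name
    'sensor=c3 temp_c=19',            # not JSON at all
]


def read_temperatures(raw_lines):
    """Apply a (sensor, temp_c) schema at read time.

    Returns the rows that fit the schema and the raw lines that did not.
    """
    rows, unparsed = [], []
    for line in raw_lines:
        try:
            record = json.loads(line)
            rows.append(
                {"sensor": str(record["sensor"]), "temp_c": float(record["temp_c"])}
            )
        except (ValueError, KeyError, TypeError):
            # The line stays in storage; this reader just cannot interpret it.
            unparsed.append(line)
    return rows, unparsed


rows, unparsed = read_temperatures(RAW_LINES)
```

Under a schema-on-write design, the second and third lines would have been rejected at ingestion; here they remain available in `RAW_LINES` for a reader that knows how to parse them.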

Another way to put this is that schema on write is better for getting very clean and consistent data sets, but those data sets may be more limited. Schema on read casts a wider net and allows for more versatile organization of data. It is also easier to create two different views of the same data with schema on read, because each view is simply a different schema applied to the same stored records.
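As a rough illustration of two views over the same data, the sketch below reads one set of raw records through two different schemas. The records, field names, and the `read_view` helper are hypothetical and exist only for this example.

```python
import json

# Hypothetical raw events, stored once as text.
RAW_EVENTS = [
    '{"user": "ana", "country": "PL", "amount": "12.50", "ts": "2024-01-03"}',
    '{"user": "bo", "country": "US", "amount": "7", "ts": "2024-01-04"}',
]

# Two hypothetical schemas over the same records: field name -> Python type.
BILLING_VIEW = {"user": str, "amount": float}
GEO_VIEW = {"country": str, "ts": str}


def read_view(raw_lines, schema):
    """Parse each raw line and keep only the schema's fields, cast to its types."""
    return [
        {field: cast(record[field]) for field, cast in schema.items()}
        for record in map(json.loads, raw_lines)
    ]


billing = read_view(RAW_EVENTS, BILLING_VIEW)
geo = read_view(RAW_EVENTS, GEO_VIEW)
```

Neither view changes what is stored, so adding a third view later would only require a new schema, not a reload of the data.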

This schema-on-read strategy is one essential reason why Hadoop and related technologies are so popular in enterprise technology. Businesses use large amounts of raw data to power all sorts of business processes, applying fuzzy logic and other sorting and filtering systems to corporate data warehouses and other large data assets.

Margaret Rouse
Editor

Margaret is an award-winning technical writer, teacher, and lecturer. She is known for her ability to explain complex technical concepts in simple terms to business audiences. For twenty years, her definitions of IT terms have been published by Que in an encyclopedia of technology terms and cited in articles in the New York Times, Time magazine, USA Today, ZDNet, PC Magazine, and Discovery Magazine. Margaret joined the Techopedia team in 2011. Margaret enjoys helping business and IT professionals find a common language. In her work, as she puts it, she builds bridges between these two domains, in this…