Dear readers, these Data Warehousing Interview Questions have been designed to acquaint you with the nature of the questions you may encounter during an interview on the subject of Data Warehousing.
A : A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data that supports management's decision-making process.
A : Subject-oriented signifies that the data warehouse stores information organized around particular subjects such as product, customer, and sales.
A : Some applications include financial services, banking services, consumer goods, retail sectors, and controlled manufacturing.
A : OLAP is an acronym for Online Analytical Processing, and OLTP is an acronym for Online Transaction Processing.
A : A data warehouse contains historical information that is made available for analysis of the business, whereas an operational database contains the current information that is required to run the business.
A : A data warehouse can implement a star schema, a snowflake schema, or a fact constellation schema.
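For illustration, here is a minimal star-schema sketch in Python using the standard sqlite3 module; the dimension and fact table names (dim_time, dim_product, dim_customer, fact_sales) are invented for the example rather than taken from any particular warehouse.

```python
import sqlite3

# In-memory database for illustration only.
conn = sqlite3.connect(":memory:")

# Dimension tables (hypothetical names): one per business entity.
conn.executescript("""
CREATE TABLE dim_time     (time_id INTEGER PRIMARY KEY, day TEXT, month TEXT, year INTEGER);
CREATE TABLE dim_product  (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT);

-- Central fact table: a foreign key to every dimension plus numeric measures.
CREATE TABLE fact_sales (
    time_id      INTEGER REFERENCES dim_time(time_id),
    product_id   INTEGER REFERENCES dim_product(product_id),
    customer_id  INTEGER REFERENCES dim_customer(customer_id),
    units_sold   INTEGER,
    dollars_sold REAL
);
""")
conn.close()
```

A snowflake schema would further normalize the dimension tables, and a fact constellation would share dimensions between several fact tables.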
A : Data Warehousing is the process of constructing and using the data warehouse.
A : Data Warehousing involves data cleaning, data integration, and data consolidation.
A : The functions performed by data warehouse tools and utilities are Data Extraction, Data Cleaning, Data Transformation, Data Loading, and Refreshing.
A : Data extraction means gathering data from multiple heterogeneous sources.
A : Metadata is simply defined as data about data. In other words, we can say that metadata is the summarized data that leads us to the detailed data.
A : A metadata repository contains the definition of the data warehouse, business metadata, operational metadata, data for mapping from the operational environment to the data warehouse, and the algorithms for summarization.
A : Data cube helps us to represent the data in multiple dimensions. The data cube is defined by dimensions and facts.
A : The dimensions are the entities with respect to which an enterprise keeps the records.
A : A data mart contains a subset of organization-wide data. This subset of data is valuable to specific groups within an organization. In other words, a data mart contains data specific to a particular group.
A : The view over an operational data warehouse is known as a virtual warehouse.
A : The stages are IT Strategy, Education, Business Case Analysis, Technical Blueprint, Build the Version, History Load, Ad hoc Query, Requirement Evolution, Automation, and Extending Scope.
A : A load manager performs the operations required for the extract and load process. The size and complexity of a load manager varies between specific solutions, from one data warehouse to another.
A : A load manager extracts data from the source system, fast-loads the extracted data into a temporary data store, and performs simple transformations to bring the data into a structure similar to that of the data warehouse.
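A minimal sketch of those load-manager steps, again in Python with sqlite3 and an invented source table src_orders: extract from the source, fast-load into a temporary staging store, then apply a simple transformation into a warehouse-like structure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical operational source table.
CREATE TABLE src_orders (order_id INTEGER, order_date TEXT, amount REAL);
INSERT INTO src_orders VALUES (1, '2024-01-05', 120.0), (2, '2024-01-06', 80.0);

-- 1. Extract and fast-load into a temporary staging store (no constraints, no indexes).
CREATE TEMP TABLE stage_orders AS SELECT * FROM src_orders;

-- 2. Simple transformation into a structure similar to the warehouse fact table.
CREATE TABLE fact_orders (date_key TEXT, amount REAL);
INSERT INTO fact_orders
SELECT substr(order_date, 1, 7) AS date_key,  -- derive a month key from the date
       amount
FROM stage_orders;
""")
print(conn.execute("SELECT * FROM fact_orders").fetchall())
conn.close()
```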
A : The warehouse manager is responsible for the warehouse management process. It consists of third-party system software, C programs, and shell scripts. The size and complexity of a warehouse manager varies between specific solutions.
A : The warehouse manager performs consistency and referential integrity checks, creates indexes, business views, and partition views against the base data, transforms and merges the source data from the temporary store into the published data warehouse, backs up the data in the data warehouse, and archives the data that has reached the end of its captured life.
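A fragment of those warehouse-manager duties can be sketched the same way; the index and business view below are built against a hypothetical fact_sales table, and the backup and archiving steps are left out of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_sales (sale_date TEXT, product_id INTEGER, dollars_sold REAL);

-- Index on the base data to speed up date-restricted queries.
CREATE INDEX idx_fact_sales_date ON fact_sales(sale_date);

-- Business view presenting the base data in the terms business users ask for.
CREATE VIEW v_daily_sales AS
SELECT sale_date, SUM(dollars_sold) AS total_dollars
FROM fact_sales
GROUP BY sale_date;
""")
conn.close()
```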
A : Summary information is the area in the data warehouse where predefined aggregations are kept.
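One common way to hold such predefined aggregations is a physically stored summary table derived from the detail-level fact table, as in this sketch over a simplified, hypothetical fact_sales table; queries can then read the small summary instead of scanning the detail rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_sales (month TEXT, product_id INTEGER, dollars_sold REAL);
INSERT INTO fact_sales VALUES ('2024-01', 1, 100.0), ('2024-01', 2, 50.0), ('2024-02', 1, 75.0);

-- Predefined aggregation kept as summary information.
CREATE TABLE summary_monthly_sales AS
SELECT month, SUM(dollars_sold) AS total_dollars
FROM fact_sales
GROUP BY month;
""")
print(conn.execute("SELECT * FROM summary_monthly_sales").fetchall())
conn.close()
```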
A : The query manager is responsible for directing queries to the suitable tables.
A : There are four types of OLAP servers, namely Relational OLAP, Multidimensional OLAP, Hybrid OLAP, and Specialized SQL Servers.
A : Multidimensional OLAP is faster than Relational OLAP.
A : OLAP performs functions such as roll-up, drill-down, slice, dice, and pivot.
A : Only one dimension is selected for the slice operation.
A : For the dice operation, two or more dimensions of a given cube are selected.
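The sketch below, assuming pandas is available, builds a tiny cube over invented quarter, product, and city dimensions with a sales measure, then applies roll-up, slice, dice, and pivot to it.

```python
import pandas as pd

# A tiny cube: three dimensions (quarter, product, city) and one measure (sales).
cube = pd.DataFrame({
    "quarter": ["Q1", "Q1", "Q2", "Q2"],
    "product": ["phone", "laptop", "phone", "laptop"],
    "city":    ["Delhi", "Mumbai", "Delhi", "Mumbai"],
    "sales":   [100, 200, 150, 250],
})

# Roll-up: aggregate away the city dimension (climb to a coarser level).
rollup = cube.groupby(["quarter", "product"], as_index=False)["sales"].sum()

# Slice: select a single value of one dimension.
slice_q1 = cube[cube["quarter"] == "Q1"]

# Dice: select on two or more dimensions at once.
dice = cube[(cube["quarter"] == "Q1") & (cube["city"].isin(["Delhi", "Mumbai"]))]

# Pivot: rotate the cube to view products against quarters.
pivoted = cube.pivot_table(index="product", columns="quarter", values="sales", aggfunc="sum")

print(rollup, slice_q1, dice, pivoted, sep="\n\n")
```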
A : There is only one fact table in a star schema.
A : Normalization splits up the data into additional tables.
A : Snowflake schema uses the concept of normalization.
A : Normalization helps in reducing data redundancy.
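As a sketch of that normalization, the snowflaked product dimension below splits a hypothetical category attribute into its own table, so the category name is stored once rather than repeated on every product row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Star-style dimension: the category name is repeated on every product row.
CREATE TABLE dim_product_star (product_id INTEGER PRIMARY KEY, name TEXT, category_name TEXT);

-- Snowflaked (normalized) version: the category is split into its own table.
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE dim_product  (product_id INTEGER PRIMARY KEY, name TEXT,
                           category_id INTEGER REFERENCES dim_category(category_id));
""")
conn.close()
```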
A : Data Mining Query Language (DMQL) is used for schema definition.
A : DMQL is based on Structured Query Language (SQL).
A : Partitioning is done for various reasons, such as easing management, assisting backup and recovery, and enhancing performance.
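SQLite, used in the earlier sketches, has no native table partitioning, so the sketch below simply simulates range partitioning by hand, routing hypothetical fact rows into per-year tables that can then be managed, backed up, or queried independently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
rows = [("2023-03-01", 100.0), ("2024-07-15", 250.0), ("2024-11-02", 80.0)]

# Route each fact row into a per-year partition table (manual range partitioning).
for sale_date, amount in rows:
    part = f"fact_sales_{sale_date[:4]}"          # e.g. fact_sales_2024
    conn.execute(f"CREATE TABLE IF NOT EXISTS {part} (sale_date TEXT, amount REAL)")
    conn.execute(f"INSERT INTO {part} VALUES (?, ?)", (sale_date, amount))

# Each partition can now be backed up, indexed, or dropped independently.
print(conn.execute("SELECT * FROM fact_sales_2024").fetchall())
conn.close()
```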
A : Data marting involves hardware and software costs, network access costs, and time costs.