Data Warehousing Interview Questions and Answers

 






What Is ODS? ODS stands for Operational Data Store; it is a repository of real-time operational data rather than long-term trend data.

What Is The Difference Between A View And A Materialized View? A view is nothing but a virtual table that exposes the output of a query and can be used in place of tables. A materialized view provides indirect access to table data by storing the results of a query in a separate schema.

What Is ETL? ETL is software that reads data from a specified data source and extracts a desired subset of data. Next, it transforms the data using rules and lookup tables and converts it to the desired state. Finally, the load function writes the resulting data to the target database.

What Is VLDB? VLDB stands for Very Large Database. Decision support systems that serve large numbers of users typically run on VLDBs.

What Is Real-Time Datawarehousing? Real-time datawarehousing captures business data as soon as it occurs.
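To make the view vs. materialized view distinction above concrete, here is a minimal sketch in Oracle-style SQL; the sales table is a hypothetical example, and syntax varies by database:

    -- Plain view: the defining query runs every time the view is accessed.
    CREATE VIEW product_sales_v AS
      SELECT product_id, SUM(amount) AS total_amount
      FROM sales
      GROUP BY product_id;

    -- Materialized view: the result set is stored and refreshed on demand,
    -- so queries read precomputed rows instead of re-running the aggregation.
    CREATE MATERIALIZED VIEW product_sales_mv
      BUILD IMMEDIATE
      REFRESH COMPLETE ON DEMAND
    AS
      SELECT product_id, SUM(amount) AS total_amount
      FROM sales
      GROUP BY product_id;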

A fact is a key performance indicator used to analyze the business. A dimension is used to analyze the fact; without dimensions, a fact has no meaning. A static variable is not created on the function stack but in the initialized data segment, and hence the variable can be shared across multiple calls of the same function; usage of static variables within a function is not thread safe. A local (auto) variable, on the other hand, is created on the function stack, is valid only in the context of the function call, and is not shared across calls.

What Is A Source Qualifier? When you add a relational or flat-file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier represents the rows that the Informatica Server reads when it executes a session. Data Mining is used for estimating the future. Traditional approaches use simple algorithms for estimating the future.

However, they do not give accurate results when compared to Data Mining. View - stores the SQL statement in the database and lets you use it as a table; every time you access the view, the SQL statement executes. Materialized view - stores the results of the SQL query in table form in the database.

The SQL statement executes only once, and after that, every time you run the query, the stored result set is used. Pros include quick query results. Kimball and Inmon differ in their concepts of building the data warehouse. Kimball views data warehousing as a constituency of data marts. Data marts are focused on delivering business objectives for departments in the organization, and the data warehouse is a conformed dimension of the data marts. Hence, a unified view of the enterprise can be obtained from dimension modeling at a local, departmental level.

Inmon believes in creating a data warehouse on a subject-by-subject-area basis. Hence, the development of the data warehouse can start with data from the online store, and other subject areas can be added to the data warehouse as the need arises. Point-of-sale (POS) data can be added later if management decides it is necessary.

What Is Junk Dimension? A junk dimension groups random flags and text attributes and moves them to a separate sub-dimension. A degenerate dimension keeps control information on the fact table itself: consider a dimension table with fields like order number and order line number that have a 1:1 relationship with the fact table. Whenever we have such keys in a table, it implies that the table is in normal form.

The basic difference between the modeling approaches is that E-R modeling has both logical and physical models, while the dimensional model has only a physical model. What Is Conformed Fact? Conformed facts are facts that share the same definition wherever they are used; similarly, conformed dimensions are dimensions which can be used across multiple data marts in combination with multiple fact tables.
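A hedged sketch of how a degenerate dimension ends up on the fact table; the date_dim and product_dim tables referenced here are hypothetical (Oracle-style SQL):

    CREATE TABLE sales_fact (
      date_key     INTEGER NOT NULL REFERENCES date_dim (date_key),
      product_key  INTEGER NOT NULL REFERENCES product_dim (product_key),
      order_number VARCHAR2(20) NOT NULL,  -- degenerate dimension: no dimension table of its own
      order_line   INTEGER NOT NULL,
      sales_amount NUMBER(12,2),
      CONSTRAINT sales_fact_pk PRIMARY KEY (order_number, order_line)
    );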

Every company has a methodology of its own. What Is Bus Schema? A BUS schema is composed of a master suite of conformed dimensions and standardized definitions of facts. What Is Data Warehousing Hierarchy? Hierarchies are logical structures that use ordered levels as a means of organizing data.

A hierarchy can be used to define data aggregation. For example, in a time dimension, a hierarchy might aggregate data from the month level to the quarter level to the year level.

A hierarchy can also be used to define a navigational drill path and to establish a family structure. Within a hierarchy, each level is logically connected to the levels above and below it. Data values at lower levels aggregate into the data values at higher levels.

A dimension can be composed of more than one hierarchy. For example, in the product dimension, there might be two hierarchies--one for product categories and one for product suppliers. Dimension hierarchies also group levels from general to granular. Query tools use hierarchies to enable you to drill down into your data to view different levels of granularity. This is one of the key benefits of a data warehouse. When designing hierarchies, you must consider the relationships in business structures.

Hierarchies impose a family structure on dimension values. For a particular level value, a value at the next higher level is its parent, and values at the next lower level are its children. These familial relationships enable analysts to access data quickly. Data validation is to make sure that the loaded data is accurate and meets the business requirements.

Strategies are different methods followed to meet the validation requirements. There are three different data types: Dimensions, Measures, and Details. A view is nothing but an alias, and it can be used to resolve loops in the universe.

What Is Surrogate Key? Where Do We Use It? Explain With Examples. A surrogate key is a substitution for the natural primary key. It is just a unique identifier or number for each row that can be used as the primary key of the table.

The only requirement for a surrogate primary key is that it is unique for each row in the table. Data warehouses typically use a surrogate key (also known as an artificial or identity key) for the dimension tables' primary keys. It is useful because the natural primary key (e.g., Customer Number in the Customer table) can change, and this makes updates more difficult.
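A minimal sketch of a surrogate key in practice, assuming a hypothetical customer_dim table and an Oracle sequence:

    CREATE SEQUENCE customer_dim_seq;

    CREATE TABLE customer_dim (
      customer_key    INTEGER PRIMARY KEY,  -- surrogate key: a meaningless, unique number
      customer_number VARCHAR2(20),         -- natural key: may change over time
      customer_name   VARCHAR2(100)
    );

    INSERT INTO customer_dim (customer_key, customer_number, customer_name)
    VALUES (customer_dim_seq.NEXTVAL, 'C-1001', 'Acme Ltd');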

What Is A Linked Cube? A linked cube is one in which a sub-set of the data can be analyzed in detail. The linking ensures that the data in the cubes remains consistent. Metadata is also present at the data mart level: subsets, facts and dimensions, ODS, etc. What Is Dimensional Modeling? Dimensional modeling is a design concept used by many data warehouse designers to build their data warehouses; it is a method for designing a data warehouse. Three types of data models are commonly distinguished: conceptual, logical, and physical.

The perception of what constitutes a VLDB continues to grow; a one-terabyte database would normally be considered a VLDB. What Is Degenerate Dimension Table? If a table contains values which are neither dimensions nor measures, they are called degenerate dimensions.

For example, invoice id or employee number. A degenerate dimension is data that is dimensional in nature but stored in a fact table. What Is An ER Diagram? The Entity-Relationship (ER) model was originally proposed by Peter Chen in 1976 [Chen76] as a way to unify the network and relational database views.

Simply stated, the ER model is a conceptual data model that views the real world as entities and relationships. A basic component of the model is the Entity-Relationship diagram, which is used to visually represent data objects. Since Chen wrote his paper, the model has been extended, and today it is commonly used for database design. For the database designer, the utility of the ER model is: the constructs used in the ER model can easily be transformed into relational tables, and it is simple and easy to understand with a minimum of training.

Therefore, the database designer can use the model to communicate the design to the end user. In addition, the model can be used as a design plan by the database developer to implement a data model in specific database management software. In a star schema, queries also return results fast. A snowflake schema is the normalized form of a star schema: it contains deeper joins, because the tables are split into many pieces, so there will be some delay in processing queries.
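The join difference can be seen in a sketch like the following, assuming hypothetical sales_fact, product_dim and category_dim tables:

    -- Star schema: one join to a denormalized product dimension.
    SELECT p.category_name, SUM(f.sales_amount) AS sales
    FROM sales_fact f
    JOIN product_dim p ON p.product_key = f.product_key
    GROUP BY p.category_name;

    -- Snowflake schema: the category attribute sits in its own table,
    -- so the same question needs an extra join.
    SELECT c.category_name, SUM(f.sales_amount) AS sales
    FROM sales_fact f
    JOIN product_dim p  ON p.product_key  = f.product_key
    JOIN category_dim c ON c.category_key = p.category_key
    GROUP BY c.category_name;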

Yes, dimensions can even contain numeric values, because these are descriptive elements of our business. It may happen that in a table some columns are important and we need to track changes to them, i.e., capture the historical data for them. You can add that a surrogate key is not an intelligent key but is similar to a sequence number, typically tied to a timestamp!

You can have only one clustered index per table. If you use the delete command, you can roll back; if you do not want the records at all, you may use the truncate command, which is faster and does not fill your redo log file. In a DWH, loops may exist between the tables. If loops exist, query generation will take more time, because more than one path is available.

It also creates ambiguity. Loops can be avoided by creating aliases of the table or by using contexts; for example, create an alias for the cost table to avoid a loop. The error log in Informatica is one of the output files created by the Informatica Server while running a session; it holds the error messages.

It is created in the Informatica home directory. To define an integrated schema design, we have to define the following concepts:

A: A fact constellation is the result of joining two or more fact tables. B: A fact table without any facts is known as a factless fact table. C: A dimension which is reusable and fixed, shared with multiple fact tables, is known as a conformed dimension.

What Is Drilling Across? Drilling across corresponds to switching from one classification in one dimension to a different classification in a different dimension. In Business Objects Universe Designer you can open the Table Browser, select the tables needed, and then insert them into the designer. Where Are The Cache Files Stored?

What Is Dimension Modeling? A logical design technique that seeks to present the data in a standard, intuitive framework that allows for high-performance access. Many modeling techniques exist; however, ER and DM are the popular ones. What Is Data Cleaning? How Can We Do That?

Data cleaning is a largely self-explanatory term. Most of the data warehouses in the world source data from multiple systems - systems that were created long before data warehousing was well understood, and hence without the vision to consolidate the same in a single repository of information. In such a scenario, discrepancies such as duplicate records, inconsistent naming conventions, and missing or invalid values are possible. In order to ensure that the data warehouse is not infected by any of these discrepancies, it is important to cleanse the data using a set of business rules before it makes its way into the data warehouse.

In data warehousing, levels are columns available in a dimension table, and levels have attributes. Hierarchies are used for navigational purposes; there are two types of hierarchies, and you can define hierarchies top-down or bottom-up. 1. Natural Hierarchy: in a natural hierarchy, a definite relationship exists between each level. 2. Navigational Hierarchy: for example, Lead Time to procure versus Actual Procurement Time; here, two levels need not have a relationship.

This hierarchy is created for navigational purposes. A dirty dimension is nothing but a junk dimension. Core dimensions are dedicated to a single fact table or data mart; conformed dimensions are used across fact tables or data marts. A universe does not hold any data; however, in practice, a universe is known to have issues when the number of objects grows very large. What Is Core Dimension? A core dimension is a dimension table which is dedicated to a single fact table or data mart.

A conformed dimension is a dimension table which is used across fact tables or data marts. The Informatica Filter transformation's default condition value is 1, i.e., TRUE. If you place a breakpoint on a Filter transformation and run the mapping in debugger mode, you will find these values, 1 or 0, for each row passing through the filter.

If you change a 0 to 1, that particular row will be passed to the next stage. What Is Galaxy Schema? A galaxy schema is also known as a fact constellation schema.

It requires a number of fact tables to share dimension tables. In data warehousing, conceptual hierarchies are mainly used. A data warehouse is made up of many data marts: a DWH contains many subject areas, whereas a data mart generally focuses on one subject area. If there is a DWH for a bank, there can be one data mart for accounts, one for loans, etc. These are high-level definitions. What Is Meta Data? Metadata is data about data. What Is Data Mart? A data mart only contains the required subject-specific data for local analysis.

A database, or collection of databases, designed to help managers make strategic decisions about their business. Some data marts, called dependent data marts, are subsets of larger data warehouses. A data mart is a simpler form of a data warehouse focused on a single subject or functional area, such as sales, finance, marketing, or HR, and it represents data from a single business process. A data mart is a subset of a data warehouse; we can say a data mart is a collection of individual departmental information, whereas a data warehouse is a collection of data marts.

What Are Data Marts? A data mart is a segment of a data warehouse that can provide data for reporting and analysis on a section, unit, department or operation in the company. Data marts are sometimes complete individual data warehouses, which are usually smaller than the corporate data warehouse.

A data mart is a subject-oriented database which supports the business needs of individual departments within the enterprise. It is a subset of the enterprise data warehouse and is also known as a high-performance query structure. Data validation is generally done manually in a DWH: if source and target are relational, you need to create SQL scripts to validate source and target data; if the source is a flat file or a non-relational database, you can use Excel if the data volume is very small, or create dummy tables to validate your ETL code.
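A hedged example of such a validation script, assuming hypothetical src_sales and tgt_sales_fact tables in Oracle:

    -- Rows present in the source but missing from the target:
    SELECT order_number, order_line, sales_amount FROM src_sales
    MINUS
    SELECT order_number, order_line, sales_amount FROM tgt_sales_fact;

    -- Quick row-count reconciliation between source and target:
    SELECT (SELECT COUNT(*) FROM src_sales)      AS src_rows,
           (SELECT COUNT(*) FROM tgt_sales_fact) AS tgt_rows
    FROM dual;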

What Is Data Mining? Data Mining is the process of analyzing data from different perspectives and summarizing it into useful information. An ODS is a database structure that is a repository for near-real-time operational data rather than long-term trend data.

The ODS may further become the enterprise's shared operational database, allowing operational systems that are being re-engineered to use the ODS as their operational database. ETL is an abbreviation of extract, transform, and load.

The data can come from any source; ETL is powerful enough to handle such data disparities. First, the extract function reads data from a specified source database and extracts a desired subset of data. Next, the transform function works with the acquired data - using rules or lookup tables, or creating combinations with other data - to convert it to the desired state.

Finally, the load function is used to write the resulting data to a target database (a plain-SQL sketch of this flow appears below). OLTP database tables are normalized, which adds additional time for queries to return results.

Additionally, an OLTP database is smaller and does not contain many years of data, which is what needs to be analyzed. Foreign keys of fact tables are primary keys of dimension tables. It is clear that the fact table contains columns which are primary keys in other tables; that itself makes it a normal-form table. What Are Lookup Tables? A lookup table is a table placed against the target table based upon the primary key of the target; it updates the table by allowing only modified (new or updated) records, based on the lookup condition.
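The extract-transform-load flow and the lookup idea described above can be sketched in plain SQL, assuming hypothetical staging and dimension tables:

    INSERT INTO sales_fact (date_key, product_key, order_number, order_line, sales_amount)
    SELECT d.date_key,                 -- lookup against the date dimension
           p.product_key,              -- lookup against the product dimension
           s.order_no,
           s.line_no,
           s.qty * s.unit_price        -- a simple transformation rule
    FROM stg_sales s
    JOIN date_dim    d ON d.calendar_date = s.order_date
    JOIN product_dim p ON p.product_code  = s.product_code;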

What Are Aggregate Tables? An aggregate table contains a summary of existing warehouse data, grouped to certain levels of dimensions. It is always easier to retrieve data from an aggregated table than to visit the original table, which may have millions of records.

Aggregate tables reduce the load on the database server, increase the performance of queries, and can retrieve results quickly (see the sketch at the end of this passage). What Is Real-Time Data Warehousing? Data warehousing captures business activity data; real-time data warehousing captures business activity data as it occurs. As soon as the business activity is complete and there is data about it, the completed activity data flows into the data warehouse and becomes available instantly.
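A minimal sketch of building such an aggregate table, assuming the hypothetical sales_fact and date_dim tables used earlier:

    CREATE TABLE sales_month_agg AS
    SELECT d.year_number, d.month_number, f.product_key,
           SUM(f.sales_amount) AS total_sales
    FROM sales_fact f
    JOIN date_dim d ON d.date_key = f.date_key
    GROUP BY d.year_number, d.month_number, f.product_key;

Month-level queries can then read sales_month_agg instead of scanning millions of fact rows.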

What Are Conformed Dimensions? Conformed dimensions mean the exact same thing with every possible fact table to which they are joined. They are common to the cubes.

Time dimensions are usually loaded by a program that loops through all possible dates that may appear in the data (a loading sketch appears a little further below). Level of granularity means the level of detail that you put into the fact table in a data warehouse.

Level of granularity means the detail you are willing to record for each transactional fact. What Are Non-additive Facts? Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table. However, they are not considered useless; if there are changes in the dimensions, the same facts can be useful. What Is A Factless Fact Table? A fact table which does not contain numeric fact columns is called a factless fact table.
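The date-looping program mentioned above can be done in a single Oracle statement; a sketch, assuming a hypothetical date_dim layout:

    INSERT INTO date_dim (date_key, calendar_date, year_number, quarter_number, month_number)
    SELECT TO_NUMBER(TO_CHAR(d, 'YYYYMMDD')),     -- one row per day
           d,
           EXTRACT(YEAR FROM d),
           TO_NUMBER(TO_CHAR(d, 'Q')),
           EXTRACT(MONTH FROM d)
    FROM (SELECT DATE '2019-01-01' + LEVEL - 1 AS d
          FROM dual
          CONNECT BY LEVEL <= 365);               -- generates every date of 2019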

Explain About OLAP? OLAP, online analytical processing, provides answers to queries which are multi-dimensional in nature. It combines relational reporting and data mining to provide solutions for business intelligence. A hypercube or multidimensional cube forms the core of an OLAP system.

This consists of measures which are arranged according to dimensions; dimensions are extracted from the dimension tables and measures from the fact table. Explain About MOLAP? Simple database structures such as time period, product, and location are used.

The functioning of each dimension or data structure is defined by one or more hierarchies. Explain About ROLAP? Data and tables are stored as relational tables.

To hold new information or data, new tables are created. Explain About Aggregations? OLAP can process complex queries and return the output in a very short time. Aggregations are built by aggregating and changing the data along the dimensions.

The possible combinations of aggregations can be determined by the combination possibilities of dimension granularities. Calculating all possible aggregations is often not feasible, which is why only some of the combinations are pre-computed. In order to determine which data should be pre-calculated, developers use a view-selection application; this solution is often used to reduce the calculation problem.
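Aggregation along dimension combinations can be sketched with GROUP BY CUBE, which computes every combination of the listed dimensions in one pass (table names are hypothetical):

    SELECT d.year_number, p.category_name, SUM(f.sales_amount) AS sales
    FROM sales_fact f
    JOIN date_dim    d ON d.date_key    = f.date_key
    JOIN product_dim p ON p.product_key = f.product_key
    GROUP BY CUBE (d.year_number, p.category_name);  -- by year, by category, by both, and overall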

Bitmaps are very useful in a star schema for joining large tables to small tables. Bit arrays are used to answer queries by performing logical operations on them. Bitmap indexes are very efficient in handling gender differentiation, and repetitive tasks are performed with much greater efficiency. Bitmaps commonly use one bitmap for every single distinct value. The number of bitmaps used can be reduced by opting for a different type of encoding; space can be optimized, but when a query is generated the bitmaps still have to be accessed.
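A minimal Oracle sketch of the gender example above (the table and column names are hypothetical):

    -- Low-cardinality column: one compact bitmap per distinct value.
    CREATE BITMAP INDEX customer_gender_bix ON customer_dim (gender);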

Explain About Binning? The binning process is very useful for saving space. Performance may vary depending upon the query generated; sometimes the solution to a query can come within a few seconds and sometimes it may take longer.

The binning process holds multiple values in the same bin. Explain About Candidate Check? The process which is undertaken during the check of base data is known as a candidate check. When performing a candidate check, performance varies either towards the positive side or the negative side.

The performance of a candidate check depends upon the user query and on how the base data is examined. Explain About Hybrid OLAP? When a database developer uses Hybrid OLAP, the data is divided between relational and specialized storage.

In some configurations, a HOLAP database may store huge amounts of data in its relational tables, while specialized data storage is used for data which is less detailed and more aggregated.

Later, XML for Analysis (XMLA) was used as the analysis specification, and it was adopted by many vendors throughout the world as a standard. "Shared" means the product implements most of the security features of OLAP; if multiple accesses are required, the admin can make the necessary changes. The default security level for all OLAP products is read-only.

For multiple updates, it is important to make the necessary security changes. Explain About Analysis? Analysis covers the logical and statistical processing required for an efficient output. This involves writing code and performing calculations, but most of these languages do not require complex programming knowledge. Many specific features are included, such as time-series analysis, currency translation, etc.

Multidimensional support is essential if we are to include multiple hierarchies in our data analysis. The multidimensional feature allows a user to analyze the business and organization, and OLAP efficiently handles support for multidimensional features. A database marketing tool or application helps a marketing professional determine the right tool or strategy for a valuable ad campaign.

This tool collects data from all sources and gives the specialist information relevant to their ad campaign; it gives a complete picture to the developer. Many different companies can use this tool for developing their business strategy, but three major industries use it the most: consumer goods, retail, and financial services. The data warehouse and the OLTP database are both relational databases.

However, the objectives of these two databases are different. The OLTP database records transactions in real time and aims to automate the clerical data-entry processes of a business entity. Addition, modification, and deletion of data in the OLTP database are essential, and the semantics of the application used on the front end affect the organization of the data in the database. The data warehouse, on the other hand, does not cater to the real-time operational requirements of the enterprise.

It is more a storehouse of current and historical data and may also contain data extracted from external data sources. What Are Various OLAP Tools? ETL is an extraction, transformation, and loading tool, whereas OLAP is online analytical processing, where you can get online reports after performing joins and creating cubes. To do this, configure the desired Security section in the Cognos Configuration.

A data warehouse is used for business measures, cannot be used to cater to the real-time business needs of the organization, and is optimized for large volumes of data and unpredictable queries. On the other hand, an OLTP database is for real-time business operations that use a common set of transactions.

A data warehouse does not require validation of data; an OLTP database requires validation of data. In simple terms, level of granularity defines the extent of detail. As an example, consider the geographical level of granularity: country, region, or city. A star schema is created when all the dimension tables link directly to the fact table.

Since the graphical representation resembles a star, it is called a star schema. It must be noted that the foreign keys in the fact table link to the primary keys of the dimension tables. What Is A Fact Table? The fact table contains the measurements, metrics, or facts of a business process. If your business process is "Sales", then a measurement of this business process, such as "monthly sales number", is captured in the fact table.

The fact table also contains the foreign keys to the dimension tables. What Is A Data Warehouse? A data warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated. This makes it much easier and more efficient to run queries over data that originally came from different sources.

Typical relational databases are designed for on-line transaction processing (OLTP) and do not meet the requirements for effective on-line analytical processing (OLAP). As a result, data warehouses are designed differently from traditional relational databases.

While the ER model lists and defines the constructs required to build a data model, there is no standard process for doing so. Why Is Data Modeling Important? Data modeling is probably the most labor-intensive and time-consuming part of the development process.

Why bother, especially if you are pressed for time? On the fact table, it is best to use bitmap indexes.

Semi-additive facts are facts that can be summed up for some of the dimensions in the fact table, but not the others.

For example, an account balance can be summed across accounts at a point in time, but not across time periods. Factless fact tables are often used to record events or coverage information. Explain Degenerated Dimension. A degenerate dimension is a dimension which has only a single attribute; data items that are not facts and do not fit into the existing dimensions are placed there. Degenerate dimensions are the fastest way to group similar transactions, and they are used when fact tables represent transactional data.


They can be used as a primary key for the fact table, but they cannot act as foreign keys. Conventional load: before loading the data, all table constraints are checked against the data.

Direct load (faster loading): all constraints are disabled and data is loaded directly; later the data is checked against the table constraints, and bad data is not indexed. What Are Slowly Changing Dimensions? Dimensions that change over time are called slowly changing dimensions. For instance, a product's price changes over time, people change their names for some reason, and country and state names may change over time.

These are a few examples of slowly changing dimensions, since changes happen to them over a period of time. If the data in a dimension table changes very rarely, it is called a slowly changing dimension. Popular reporting tools include:
1. MS-Excel
2. Business Objects (Crystal Reports)
3. Cognos (Impromptu, Power Play)
4. Microstrategy
5. MS Reporting Services
6. Informatica Power Analyzer
7. Actuate
8. Hyperion (BRIO)
An alias is different from a view in the universe.

A view is at the database level, but an alias exists only at the universe level. Metadata is data about data. The repository environment encompasses all corporate metadata resources. Metadata includes things like the name, length, valid values, and description of a data element.

Metadata is stored in a data dictionary and repository. It insulates the data warehouse from changes in the schema of operational systems. Metadata synchronization is the process of consolidating, relating and synchronizing data elements with the same or similar meaning from different systems.

Metadata synchronization joins these differing elements together in the data warehouse to allow for easier access. The basic purpose of a scheduling tool in a DW application is to streamline the flow of data from source to target at a specific time or based on some condition. A surrogate key is an artificial identifier for an entity; surrogate keys do not describe anything.

A primary key is a natural identifier for an entity. Primary key values are entered by the user and uniquely identify each row; there is no repetition of data. If a column is made a primary key and later the data type or length of that column needs to change, then all the foreign keys that depend on that primary key must be changed as well, making the database unstable.

Surrogate keys make the database more stable because they insulate the primary and foreign key relationships from changes in data types and lengths. What Is Snowflake Schema? Snowflake schemas normalize dimensions to eliminate redundancy; that is, the dimension data is grouped into multiple tables instead of one large table. While this saves space, it increases the number of dimension tables and requires more foreign key joins. The result is more complex queries and reduced query performance.
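A hedged sketch of snowflaking a product dimension by splitting the category attribute into its own table (names are hypothetical):

    CREATE TABLE category_dim (
      category_key  INTEGER PRIMARY KEY,
      category_name VARCHAR2(50)
    );

    CREATE TABLE product_dim (
      product_key  INTEGER PRIMARY KEY,
      product_name VARCHAR2(100),
      category_key INTEGER REFERENCES category_dim (category_key)  -- extra join at query time
    );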

To maintain a dimension: find where the data for this dimension is located; figure out how to extract this data; determine how to maintain changes to this dimension; and change the fact table and DW population routines. What Is Data Mining Used For?

Data Mining is used for estimating the future. Traditional approaches use simple algorithms for estimating the future, but they do not give accurate results when compared to Data Mining.

No tool-based testing is done in a DWH; only manual testing is done. Give Examples Of Degenerated Dimensions. A degenerate dimension is a dimension key without a corresponding dimension table; invoice number or employee number kept on a fact table are typical examples. Normally surrogate keys are sequencers, which keep increasing as new records are injected into the table; the standard data type is integer.

If your source is a COBOL copybook, there is a Unix utility which generates the required record format for Ab Initio. Rollup is for group-by and Scan is for successive totals: when we need to produce a cumulative summary we use Scan, while Rollup is used to aggregate data. The EME is a repository in Ab Initio; it is used for check-in and check-out of graphs and also maintains graph versions. What Are Broadcast And Replicate?

Broadcast takes data from multiple inputs, combines it, and sends it to all the output ports. Replicate replicates the data for a particular partition and sends it out to multiple output ports of the component, while maintaining partition integrity. E.g., if your incoming flow to Replicate has a data parallelism level of 2 and you have 3 output flows from Replicate, each output flow keeps the parallelism level of 2.

Both are graph-level parameters, but a local parameter must be initialized with a value at the time of declaration, whereas a global parameter need not be initialized; it will prompt for a value at the time of running the graph.

Here is a simple example of using a start script in a graph: in the start script, export a variable (for example, a run date), and somewhere in a graph transform you can then reference that variable with $-substitution, e.g., out.run_date :: $RUN_DATE (the variable name here is illustrative). How Does Maxcore Work? Maxcore is a value specified in KB; whenever a component is executed, it will take the amount of memory we specified for execution.

Whne ever a component is executed it will take that much memeory we specified for execution. XFR represent the tranform functions. The latest version of GDE ism1. Which we can make use in parallel unloads. It also specifies which table to use for the parallel clause. To convert 4 way to 8 way partition we need to change the layout in the partioning component. There will be seperate parameters for each and every type of partioning eg. The appropriate parameter need to be selected in the component layout for the type of partioning.

Have You Used The Rollup Component? Describe How. If the user wants to group records on particular field values, Rollup is the best way to do that. Rollup is a multi-stage transform function and contains the following mandatory functions: for each group, it first calls the initialise function once, followed by rollup function calls for each of the records in the group, and finally calls the finalise function once at the end of the last rollup call.

The primary-key table is the parent table and the foreign-key table is the child table; the criterion for joining the tables is a matching column. What Is An Outer Join? An outer join is used when one wants to select all the records from a port - whether they satisfy the join criteria or not.
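For example (hypothetical customers and orders tables):

    -- Every customer appears, even those with no matching order
    -- (the order columns come back as NULL for them).
    SELECT c.customer_name, o.order_no
    FROM customers c
    LEFT OUTER JOIN orders o ON o.customer_id = c.customer_id;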

What Are Cartesian Joins? A Cartesian join will get you a Cartesian product. A Cartesian join is when you join every row of one table to every row of another table. You can also get one by joining every row of a table to every row of itself.
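A small sketch, assuming hypothetical colors and sizes tables:

    -- With 4 colors and 3 sizes this returns 4 x 3 = 12 rows.
    SELECT c.color, s.size_code
    FROM colors c
    CROSS JOIN sizes s;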

The main purpose of a stored procedure is to reduce network traffic: all SQL statements execute on the server inside a cursor, so execution is fast. Due to heavy modification activity, the execution plan can become outdated, and hence the stored procedure's performance goes down.

If we create the stored procedure with the recompile option, SQL Server won't cache a plan for this stored procedure, and it will be recompiled every time it is run. What Is A Cursor? The work area the Oracle engine uses for internal processing in order to execute a SQL statement is called a cursor.
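A minimal PL/SQL sketch of an explicit cursor (the product_dim table is hypothetical):

    DECLARE
      CURSOR c_products IS
        SELECT product_name FROM product_dim;
      v_name product_dim.product_name%TYPE;
    BEGIN
      OPEN c_products;                       -- the user explicitly opens the cursor
      LOOP
        FETCH c_products INTO v_name;        -- fetch one row at a time
        EXIT WHEN c_products%NOTFOUND;
        DBMS_OUTPUT.PUT_LINE(v_name);
      END LOOP;
      CLOSE c_products;
    END;
    /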

There are two types of cursors like Implecit cursor and Explicit cursor. Implicit cursor is using for internal processing and Explicit cursor is using for user open for data required. There are several ways to do this: It is a DDL command, used to delete tables or clusters. Since it is a DDL command hence it is auto commit and Rollback can't be performed. It is faster than delete. It is DML command, generally used to delete a record, clusters or tables.

Rollback command can be performed , in order to retrieve the earlier deleted things. To make deleted things permanently, "commit" command should be used. If you are trying to install the Ab -Initio on stand alone machine , then it is not necessary to create the repository , While installing It creates automatically for you under abinitio folder where you installing the Ab-Initio If you are still not clear please ask your Question on the same portal.

The explain plan can be reviewed to check the execution plan of a query; this will show whether the expected indexes are used or not. As for merging GUI map files in WinRunner: it is impossible to run a file by merging two GUI map files.

Jobs can depend upon one another; for example, if your first job's result is successful then the next job will execute, otherwise it will not run.

Using Rollup we can't generate cumulative summary records; for that we use Scan.

An implicit cursor is used for internal processing, and an explicit cursor is opened by the user for the data required. Dependency analysis answers questions regarding data lineage: where does the data come from, and what applications produce and depend on this data, etc.

The design method consists of two major phases. During the first phase, you create the underlying database structure of your universe; this structure includes the tables and columns of a database and the joins by which they are linked.

You may need to resolve loops which occur in the joins using aliases or contexts. You can conclude this phase by testing the integrity of the overall structure.


During the second phase, you can proceed to enhance the components of your universe. You can also prepare certain objects for multidimensional analysis. As with the first phase, you should test the integrity of your universe structure. You may also wish to perform tests on the universes you create from the BusinessObjects User module.

Finally, you can distribute your universes to users by exporting them to the repository or via your file system. For a universe based on a simple relational schema, Designer provides Quick Design, a wizard for creating a basic yet complete universe.

You can use the resulting universe immediately, or you can modify the objects and create complex new ones; in this way, you can gradually refine the quality and structure of your universe. What Are Universe Requirements? At least one object in the class must be present in the other class so that they can have a join, and of course the data types must match.

How To Create It? Measure objects are objects that carry facts, i.e., numeric values to be aggregated. Business Objects deals with databases; for every universe you need to specify a database connection (for example, to Oracle).

We have several securities in Business Objects, like: 1. Windows authentication 2. RDBMS securities. How To Create Context? Creating a filter in Designer is different from creating a filter in Business Objects: if you create a filter in Designer it is accessible to all the reports you are using, i.e., it can be used by further applications, whereas a filter created in Business Objects is dynamic, applied at run time, and applicable only to that particular report.

How Do I Do This? You can set a predefined condition at the universe level. In the toolbar of Designer there is an icon for filtering; just click on the icon, give a name for the condition, and type the SQL for that condition. Guidelines are provided in the Universe guide. Here is the summary: using the Quick Design wizard to develop the universe will invoke the built-in strategy.

What Is Pragma? Index awareness is the ability to take advantage of the indexes on key columns to speed data retrieval; to set it, right-click on any object and go to its properties.

Aggregate awareness is a term that describes the ability of a universe to make use of aggregate tables in a database; these are tables that contain pre-calculated data. They speed up query execution and improve the performance of SQL transactions.

If you are using aggregate tables, you must refresh the aggregate tables together with all fact tables to keep your results consistent. A business object can be used to represent entities of the business that are supported in the design.


A business object can accommodate data and the behavior of the business associated with the entity. Business objects in business intelligence are entities of the business. What Is A Universe?

A universe connects the client to the data warehouse. It is a file defining the relationships among the tables in the warehouse, classes and objects, and database connection details. What Is BOMain.key? The BOMain.key file contains the address of the repository security domain. The file is stored in the LOCData folder. What Is Object Qualification?

Object qualification is an attribute of an object that helps determine how it can be used in multidimensional analysis. Using this, objects can be qualified as dimension, detail, or measure. Security levels used in BO: both security levels are handled by the administrator of the tool. Web Intelligence reports, on the other hand, need only a browser and the URL of the server from which Business Objects will be accessed.

This is not required for Web Intelligence reports: they can be accessed from anywhere, provided internet access is available. Once the login information is sent, it is validated against the repository, upon which the user can access the BO services. Batch processing can be used to schedule reports; objects can also be used for batch processing, and the batch can be used to select the objects to be processed.

The batch can be run as a transaction, in which case if one process fails the entire batch is rolled back, or it can be run as a series of jobs. The security domain in Business Objects is a domain containing all security information, like login credentials.

It checks users and their privileges. This domain is the part of the repository that also manages access to documents and the functionality available to each user. What Is Data Cardinality? Cardinality is the term used in database relations to denote the occurrences of data on either side of the relation. There are 3 basic types of data cardinality: high data cardinality (values of a data column are very uncommon), normal data cardinality (values are somewhat uncommon), and low data cardinality (values are very common, with few distinct values). What Are Aggregate Tables? Aggregate tables are tables which contain existing warehouse data which has been grouped to a certain level of dimensions.

It is easier to retrieve data from the aggregated tables than from the original table, which has more records; such a table reduces the load on the database server and increases the performance of the query. What Is A Factless Fact Table? A fact table without numeric fact columns is called a factless fact table. How Can We Load The Time Dimension? Time dimensions are usually loaded with all possible dates in a year, and this can be done through a program; here, years are represented with one row per day. What Are Non-additive Facts? Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table.

If there are changes in the dimensions, the same facts can still be useful. What Is A Conformed Fact? A conformed fact is a fact that has the same definition across multiple data marts. What Is A Datamart? A data mart is a specialized version of a data warehouse; it contains a snapshot of operational data that helps business people make decisions based on analysis of past trends and experiences. A data mart emphasizes easy access to relevant information. What Is Active Datawarehousing? An active data warehouse is a data warehouse that enables decision makers within a company or organization to manage customer relationships effectively and efficiently.

A data warehouse is a place where the whole of the data is stored for analysis, whereas OLAP is used for analyzing the data, managing aggregations, and partitioning information down to lower levels of detail. What Is An ER Diagram? ER stands for Entity-Relationship; an ER diagram illustrates the interrelationships between the entities in the database. The diagram shows the structure of each table and the links between the tables.

What Are The Key Columns In Fact And Dimension Tables? Foreign keys of dimension tables are primary keys of entity tables; foreign keys of fact tables are primary keys of the dimension tables. What Is SCD? SCD stands for slowly changing dimensions; it applies to cases where a record changes over time.
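Type 2 is the most common way to handle such a change: expire the current row and insert a new version with a fresh surrogate key. A hedged sketch, assuming a customer_dim with hypothetical effective_from, effective_to and current_flag columns:

    -- Close off the current version of the changed customer...
    UPDATE customer_dim
    SET    effective_to = SYSDATE,
           current_flag = 'N'
    WHERE  customer_number = 'C-1001'
    AND    current_flag = 'Y';

    -- ...and insert the new version as a separate row.
    INSERT INTO customer_dim
      (customer_key, customer_number, customer_name,
       effective_from, effective_to, current_flag)
    VALUES
      (customer_dim_seq.NEXTVAL, 'C-1001', 'Acme Ltd (renamed)',
       SYSDATE, NULL, 'Y');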


What Are The Types Of SCD? Three types are commonly distinguished: Type 1 overwrites the old value, Type 2 adds a new row for each change (preserving history), and Type 3 adds a new column to hold the previous value. What Is BUS Schema? A BUS schema consists of a suite of conformed dimensions and standardized definitions of facts. What Is Star Schema? A star schema is a way of organizing tables such that results can be retrieved from the database quickly in a data warehouse environment.

What Is Snowflake Schema? A snowflake schema has a primary dimension table to which one or more further dimension tables can be joined; the primary dimension table is the only table that can be joined directly with the fact table. What Is A Core Dimension? A core dimension is a dimension table which is dedicated to a single fact table or data mart. What Is Called Data Cleaning?

As the name implies, data cleaning means removing inconsistencies, duplicates, and errors from the data before it is loaded into the warehouse.


