1. What is Index? What are the types of Index? Please Explain.
Ans. An index is a structure that provides a faster way of looking up rows based on the values in the indexed columns.
If we create an index on the primary key and then search for a row of data based on one of the primary key values, SQL Server finds that value in the index and then uses the index to locate the entire row.
An index is made up of a set of pages (index nodes) organized in a B-tree hierarchy, starting from the root node and ending at the leaf nodes. The pages at each level are linked in a doubly linked list.
We can't use LOB data types such as image, text, or nvarchar(max) as index key columns; varchar(max)/nvarchar(max) columns can, however, be added to a non-clustered index as included (non-key) columns.
Clustered Index: contains the actual data rows at the leaf nodes of the index, sorted in ascending or descending order of the index key. Because the data rows themselves can be stored in only one order, only one clustered index is possible per table.
Non-Clustered Index: its leaf nodes contain the clustered index key (in the case of a clustered table) or a row locator/RID (in the case of a heap). Maximum per table: 249 in SQL Server 2005, 999 in SQL Server 2008.
A non-clustered index is used alongside a clustered index to improve read performance. Put these on columns used in WHERE clauses, JOINs, etc., and include any other columns the query needs to enhance performance even more; this is called a "covering index".
Disadvantages: indexes slow down INSERT, UPDATE, and DELETE operations, and non-clustered indexes occupy extra disk space.
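As a sketch of both index types (the dbo.Orders table and its columns are hypothetical):

```sql
-- Clustered index: the table's rows are physically stored in OrderID order.
CREATE CLUSTERED INDEX IX_Orders_OrderID
    ON dbo.Orders (OrderID);

-- Covering non-clustered index: CustomerID is the key;
-- OrderDate and TotalAmount are included as non-key columns
-- so the query below can be answered from the index alone.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
    ON dbo.Orders (CustomerID)
    INCLUDE (OrderDate, TotalAmount);

-- This query is "covered": no lookup into the base table is needed.
SELECT CustomerID, OrderDate, TotalAmount
FROM dbo.Orders
WHERE CustomerID = 42;
```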
2. What is table Partition?
Ans. When a database table grows in size to the hundreds of gigabytes or more, it can become more difficult to load new data, remove old data, and maintain indexes. Just the sheer size of the table causes such operations to take much longer. Even the data that must be loaded or removed can be very sizable, making INSERT and DELETE operations on the table impractical. The Microsoft SQL Server 2008 database software provides table partitioning to make such operations more manageable.
Partitioning a large table divides the table and its indexes into smaller partitions, so that maintenance operations can be applied on a partition-by-partition basis, rather than on the entire table. In addition, the SQL Server optimizer can direct properly filtered queries to appropriate partitions rather than the entire table.
Benefits: performance and parallelism.
To partition via the UI: right-click the table > Storage > Create Partition. The Create Partition Wizard appears; click Next, give the partition function a name, give the partition scheme a name, define the mapping, and finally choose whether to create a script, run immediately, or schedule.
(Note: table partitioning is different from the PARTITION BY clause of window functions. Window functions such as ROW_NUMBER(), RANK(), and DENSE_RANK() provide a result for each row in the dataset; GROUP BY provides one aggregated value per group, whereas PARTITION BY provides a value for each row in the dataset.)
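The table-partitioning setup described above can also be scripted in T-SQL rather than through the wizard; a minimal sketch (the table, names, and boundary dates are hypothetical):

```sql
-- 1) Partition function: defines the boundary values (here, by year).
CREATE PARTITION FUNCTION pf_SalesByYear (datetime)
AS RANGE RIGHT FOR VALUES ('2007-01-01', '2008-01-01');

-- 2) Partition scheme: maps each partition to a filegroup
--    (all to PRIMARY here, for simplicity).
CREATE PARTITION SCHEME ps_SalesByYear
AS PARTITION pf_SalesByYear ALL TO ([PRIMARY]);

-- 3) Table created on the scheme, partitioned by SaleDate.
CREATE TABLE dbo.Sales
(
    SaleID   int      NOT NULL,
    SaleDate datetime NOT NULL
) ON ps_SalesByYear (SaleDate);
```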
3. What is JOIN in SQL Server?
Ans. A join combines two or more tables or views on the basis of a relationship to produce a desired result set.
It's of three types:
1. Inner Join
2. Outer Join
a. Left outer Join
b. Right outer Join
c. Full outer Join
Inner Join: only the rows that satisfy the join condition appear in the output.
Outer Join:
Left Outer Join: returns all the records of the left table and only those rows from the right table that satisfy the join condition.
Right Outer Join: returns all the records of the right table and only those rows from the left table that satisfy the join condition.
Full Outer Join: returns all the records from both the left table and the right table, matching them where the join condition holds.
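The join types above can be sketched as follows (the Employee and Department tables are hypothetical):

```sql
-- Inner join: only employees whose DeptID matches a department.
SELECT e.Name, d.DeptName
FROM dbo.Employee e
INNER JOIN dbo.Department d ON e.DeptID = d.DeptID;

-- Left outer join: all employees, with NULL DeptName where no match.
SELECT e.Name, d.DeptName
FROM dbo.Employee e
LEFT OUTER JOIN dbo.Department d ON e.DeptID = d.DeptID;

-- Full outer join: all rows from both sides, matched where possible.
SELECT e.Name, d.DeptName
FROM dbo.Employee e
FULL OUTER JOIN dbo.Department d ON e.DeptID = d.DeptID;
```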
4. How to delete duplicate rows in SQL Server?
Ans. Two ways to delete duplicate rows in SQL Server:
1. Using %%physloc%% pseudo column.
The %%physloc%% pseudo column holds the physical location of the row. You can take the max (or min) of %%physloc%% per duplicate group and delete the rest.
2. Using Common Table Expression(CTE)
Create a CTE over the table with a row-number column, partitioned by all the columns that define a duplicate. Then keep one row per group and delete the rest.
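%%physloc%% is undocumented and version-dependent, so treat this only as an illustrative sketch of method 1 (dbo.T, Col1, and Col2 are hypothetical; one row per duplicate group is kept):

```sql
-- Keep the row with the smallest physical location in each
-- (Col1, Col2) duplicate group and delete all the others.
DELETE FROM dbo.T
WHERE %%physloc%% NOT IN
(
    SELECT MIN(%%physloc%%)
    FROM dbo.T
    GROUP BY Col1, Col2
);
```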
5. What is Common Table Expression(CTE)?
Ans. A Common Table Expression (CTE) lets us write a query, name it with an alias, and then refer to that result set by the alias just as if it were a table. Data modifications made through the alias are applied to the base table.
Advantages:
1) When a lot of sub-queries are involved, a CTE can express each sub-query as a named temporary result set that can be referred to afterwards.
2) Readability.
3) Recursive queries can be written.
4) Calculated fields like ROW_NUMBER, RANK, and DENSE_RANK can be stored in a CTE.
5) When we don't want a view to persist in metadata, we can use a CTE.
6) Looping.
Syntax:
WITH Temp_Table_Name (Col1, Col2) AS (query)
After the above statement, we can write a query referring to the above Temp_Table_Name.
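A minimal, hypothetical instance of this syntax (the Orders table and the filter are assumptions):

```sql
-- Name a filtered result set, then query it like a table.
WITH HighValueOrders (OrderID, Amount) AS
(
    SELECT OrderID, Amount
    FROM dbo.Orders
    WHERE Amount > 1000
)
SELECT OrderID, Amount
FROM HighValueOrders;
```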
6. How to send a multivalued parameter to a procedure from SSRS?
Ans. Mark the report parameter as multivalued. In the dataset, choose the stored procedure from the drop-down, click Parameters, and in the expression type Join(Parameter, delimiter). In the stored procedure, use a table-valued split function after IN.
7. 1 million records are coming from the source; if more than 10,000 records fail to match in the lookup, we want to fail the package. How?
Ans. Use a Row Count transformation on the lookup's no-match output, and a FailPackageOnFailure expression on the task.
8. How to delete duplicate records in SQL Server 2008?
Ans. %%physloc%% is a pseudo column that gives the physical location of every row (a ROW_ID), so one copy per duplicate group can be kept and the rest deleted.
Alternatively, a CTE can regenerate the same table with an additional row-number column. In our case, Col1 and Col2 together qualify rows as duplicates; it may be a different set of columns for each such query. Another point to note here is that once the CTE is created, a DELETE statement can be run on it. We put a condition: when a group contains more than one row, we remove every row that is not the first one. When the DELETE command is executed over the CTE, it in fact deletes from the base table used in the CTE.
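A sketch of this CTE-based delete (dbo.T with duplicate-defining columns Col1 and Col2 is hypothetical):

```sql
-- Number the rows within each (Col1, Col2) group,
-- then delete every row except the first in each group.
WITH Dups AS
(
    SELECT Col1, Col2,
           ROW_NUMBER() OVER (PARTITION BY Col1, Col2
                              ORDER BY Col1) AS rn
    FROM dbo.T
)
DELETE FROM Dups
WHERE rn > 1;   -- deleting from the CTE deletes from dbo.T
```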
11. Explain the different types of ranking functions?
Ans. Ranking functions return a ranking value for each row in a partition. The ranking functions are ROW_NUMBER(), RANK(), DENSE_RANK(), and NTILE().
ROW_NUMBER() returns a unique sequential value for each row, whether or not the ORDER BY value is tied.
DENSE_RANK() assigns the same ranking value to tied rows, and the next distinct value receives the next consecutive rank.
RANK() returns 1 + the number of rows that rank before the row in question, so tied rows share a rank and the function does not always return consecutive integers. The ranking value depends on how the rows appear in the result set.
NTILE(integer) OVER (PARTITION BY ... ORDER BY ...) distributes the rows in an ordered partition into the specified number of groups. The groups are numbered starting at one, and for each row NTILE returns the number of the group to which the row belongs. If the number of rows in a partition is not divisible by integer_expression, the groups will have two sizes that differ by one member; larger groups come before smaller groups in the order specified by the OVER clause. For example, if the total number of rows is 53 and the number of groups is five, the first three groups will have 11 rows and the two remaining groups will have 10 rows each. If, on the other hand, the total number of rows is divisible by the number of groups, the rows will be evenly distributed among the groups: if the total number of rows is 50 and there are five groups, each bucket will contain 10 rows.
Example of ranking functions:
C1  RN  DR  R  NTILE(3)
10   1   1  1  1
20   2   2  2  1
20   3   2  2  1
30   4   3  4  2
30   5   3  4  2
40   6   4  6  2
40   7   4  6  3
40   8   4  6  3
50   9   5  9  3
C1 - column value, RN - ROW_NUMBER, DR - DENSE_RANK, R - RANK
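A query of the following shape produces that table (dbo.T with column C1 is hypothetical):

```sql
SELECT C1,
       ROW_NUMBER() OVER (ORDER BY C1) AS RN,
       DENSE_RANK() OVER (ORDER BY C1) AS DR,
       RANK()       OVER (ORDER BY C1) AS R,
       NTILE(3)     OVER (ORDER BY C1) AS NT
FROM dbo.T;
```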
13. How to get the nth maximum id in SQL Server 2008?
Ans. Select the TOP n DISTINCT ids ordered descending, then select TOP 1 from that derived dataset ordered ascending.
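As a sketch for n = 3 (dbo.T and its id column are hypothetical):

```sql
-- 3rd maximum id: take the top 3 distinct ids descending,
-- then the smallest of those is the 3rd maximum.
SELECT TOP 1 id
FROM (SELECT DISTINCT TOP 3 id
      FROM dbo.T
      ORDER BY id DESC) AS d
ORDER BY id ASC;
```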
14. How to get the department name having the nth maximum id in SQL Server 2008?
Ans. Use DENSE_RANK() OVER (ORDER BY id DESC), store the result via a CTE, then select from it where rank = n.
15. How to get the 5th maximum salary for each department?
Ans. Use DENSE_RANK() OVER (PARTITION BY department ORDER BY salary DESC), store the result via a CTE, then select from it where rank = 5.
Remember: PARTITION BY divides the result set into groups (partitions) first, and then ORDER BY determines the ranking of each row inside its partition based on the column value.
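A sketch of this approach (the dbo.Employee table and its columns are hypothetical):

```sql
-- Rank salaries within each department, highest first.
WITH Ranked AS
(
    SELECT Department, Salary,
           DENSE_RANK() OVER (PARTITION BY Department
                              ORDER BY Salary DESC) AS rnk
    FROM dbo.Employee
)
SELECT Department, Salary
FROM Ranked
WHERE rnk = 5;   -- 5th highest distinct salary per department
```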
16. How to get the cumulative sum of a column over a period of time?
Ans. 1) Using a self join:
select r1.priority_key, r1.colour, SUM(r2.priority_key)
from Rep_Priority_Colour r1,
     Rep_Priority_Colour r2
where r2.priority_key <= r1.priority_key
group by r1.priority_key, r1.colour
order by r1.priority_key
2) Using a nested sub-query:
SELECT DayCount,
       Sales,
       Sales + COALESCE((SELECT SUM(Sales)
                         FROM Sales b
                         WHERE b.DayCount < a.DayCount), 0) AS RunningTotal
FROM Sales a
ORDER BY DayCount
3) Using a cursor (a table variable and a variable RunningTotal = RunningTotal + column value inside a WHILE loop).
4) Using CROSS APPLY.
17. What is a Correlated Subquery?
Ans. A correlated subquery is a repeated subquery: it executes once for each row returned by the parent (outer) query, and it depends on the outer query for its values.
Eg: select * from dbo.FactTicket where exists (select Priority_key from DimTicketPriority where Priority_key = FactTicket.Priority_key)
18. How to get the last day of the week or first day of the week?
Ans. 1) Take 0 as a reference date. DATEDIFF(ww, 0, GETDATE()) gives the difference between the reference and GETDATE() in weeks.
2) For the current week, use DATEDIFF(ww, 0, GETDATE()) + 0; for the current week + n, use DATEDIFF(ww, 0, GETDATE()) + n. This is simply the number of weeks from the reference.
3) DATEADD(ww, DATEDIFF(ww, 0, GETDATE()) + n, 0) gives the first day of the nth week after the current week.
4) To get the last day of week (n-1), use DATEADD(s, -1, DATEADD(ww, DATEDIFF(ww, 0, GETDATE()) + n, 0)).
19. How to get the last day of the month or first day of the month?
Ans. 1) Take 0 as a reference date. DATEDIFF(mm, 0, GETDATE()) gives the difference between the reference and GETDATE() in months.
2) For the current month, use DATEDIFF(mm, 0, GETDATE()) + 0; for the current month + n, use DATEDIFF(mm, 0, GETDATE()) + n. This is simply the number of months from the reference.
3) DATEADD(mm, DATEDIFF(mm, 0, GETDATE()) + n, 0) gives the first day of the nth month after the current month.
4) To get the last day of month (n-1), use DATEADD(s, -1, DATEADD(mm, DATEDIFF(mm, 0, GETDATE()) + n, 0)).
The same rule applies to the first and last day of any datepart. DATEPART syntax: DATEPART(mm, GETDATE()).
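For example, for the current month (n = 0 gives the first day; n + 1 = 1 feeds the last-day formula):

```sql
-- First day of the current month:
SELECT DATEADD(mm, DATEDIFF(mm, 0, GETDATE()) + 0, 0);

-- Last day of the current month
-- (one second before the first day of next month):
SELECT DATEADD(s, -1, DATEADD(mm, DATEDIFF(mm, 0, GETDATE()) + 1, 0));
```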
20. Difference between Index Scan and Index Seek?
Ans. In an index scan, SQL Server reads the whole of the index looking for matches; the time taken is proportional to the index size. In an index seek, SQL Server uses the B-tree structure of the index to navigate directly to matching records. A seek is usually preferable; however, a scan is more efficient when the table is small or when a large percentage of records match the predicate.
21. What are the different Transactional Faults?
Ans.
1) Dirty Reads
2) Non-Repeatable Reads
3) Phantom Rows
4) Lost Updates
5) Deadlocks
22. What are Dirty Reads?
Ans. Transaction 2 can read Transaction 1's uncommitted changes. This is the most egregious fault.
23. What are Non-Repeatable Reads?
Ans. Transaction 2 can read committed updates made by Transaction 1 between its own reads. At any point within the transaction, Transaction 2 should produce the same result set; if reading the same row twice gives two different results, that is a non-repeatable read.
24. What are Phantom Rows?
Ans. An insert or update in Transaction 1 causes the same SELECT in Transaction 2 to return a different set of rows.
25. What are Lost Updates?
Ans. When two users are updating the same table, the second user's update overwrites the first user's update.
26. What are Deadlocks?
Ans. A deadlock occurs when two transactions each hold a lock the other needs. Transaction 1 locks data A and needs to lock data B to complete; Transaction 2 has locked data B and needs to lock data A to complete. Each transaction is stuck waiting for the other to release its lock, and neither can complete until the other does. Unless an outside force intercedes, or one of the transactions gives up and quits, this situation could persist indefinitely.
27. What are the different Transaction Isolation Levels?
Ans. Read Uncommitted, Read committed, Repeatable Read, Serializable,
Snapshot, Read Committed Snapshot
28. What is Read Uncommitted?
Ans. Takes no locks; the least restrictive level. It doesn't prevent any transactional faults, but is good for reporting.
29. What is Read Committed?
Ans. Prevents dirty reads without bogging the system down with excessive lock contention. That's why it is SQL Server's default.
30. What is Repeatable Read?
Ans. By preventing dirty reads and non-repeatable reads, the
repeatable read isolation level
provides increased transaction isolation without the extreme
lock contention of serializable
isolation.
31. What is Serializable Read?
Ans. Prevents every transactional fault, at the cost of extreme lock contention. It is important in systems where transactional integrity matters more than performance, such as banking.
32. What is Snapshot?
Ans. Prevents reader-versus-writer conflicts by making a snapshot copy of the data being updated. When the update is committed, it is written back over the original data, so readers continue to read the snapshot copy while the writer updates the database. However, this gives rise to writer-versus-writer conflicts.
33. What is Read Committed Snapshot?
Ans. This variation of snapshot isolation behaves the same
as read committed but without the writer versus reader contention.
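A sketch of selecting an isolation level for a session (dbo.Account is hypothetical; snapshot isolation additionally requires a database-level setting, shown commented out):

```sql
-- Under REPEATABLE READ, re-reading the same row inside the
-- transaction returns the same value: no non-repeatable reads.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
BEGIN TRANSACTION;
    SELECT Balance FROM dbo.Account WHERE AccountID = 1;
    SELECT Balance FROM dbo.Account WHERE AccountID = 1;
COMMIT TRANSACTION;

-- Snapshot isolation must first be enabled on the database:
-- ALTER DATABASE MyDb SET ALLOW_SNAPSHOT_ISOLATION ON;
```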
34. What is Event Logging?
Ans. SSIS logging is a feature that enables tracing of a package execution. It can be applied at the package level or at the task level.
1. Right-click the package body.
2. Choose the log provider type (text file, Windows Event Log, XML file, SQL Profiler, SQL Server).
3. Configure it by setting its connection.
4. Specify the events in the Details tab. Each component has a LoggingMode property that accepts three values: Enabled, Disabled, and UseParentSetting (use the parent's setting for that component to decide whether or not to log the data).
35. What happens when the RetainSameConnection property is set to true?
Ans. The RetainSameConnection property on an OLE DB connection manager enables us to handle SQL Server transactions without the use of MSDTC (Microsoft Distributed Transaction Coordinator). By setting RetainSameConnection to true, we can maintain a transaction over multiple tasks.
36. What is Package Configuration?
Ans. Package configuration exposes properties of the package so they can be updated by values coming from a file, table, environment variable, parent package variable, or registry key.
37. Send Mail Task (how to set CC, BCC, From, and To)?
Ans. These are set as properties of the Send Mail Task; the message source type can be a variable, a text file, or direct input.
38. How to abruptly kill a package running on the server?
Ans. With Windows administrator rights, kill the process running the package, or stop the job.
39. How to know which packages are running on the server?
Ans. SSIS Package Instances is a performance counter that gives the total number of SSIS packages running simultaneously.
40. What is the FTP task?
Ans. The FTP task lets a package perform file operations (such as sending, receiving, and deleting files) against an FTP server.
41. What is the difference between Lookup and Fuzzy Lookup?
Ans. Lookup requires an exact match, whereas Fuzzy Lookup finds similar matches. For each input row, Lookup produces only one result, but Fuzzy Lookup can produce many.
42. What is the difference between Control Flow (CF) and Data Flow (DF)?
Ans. 1) A CF is process oriented, while a DF is data oriented.
2) A CF comprises tasks, containers, and precedence constraints, while a DF comprises sources, transformations, and destinations.
3) A CF's fundamental unit is the task, but a DF's fundamental unit is the component.
4) A CF outcome may be Success, Failure, or Completion, but a DF outcome can be anything.
43. What is the difference between Merge and Union All?
Ans. Merge needs sorted input, whereas Union All does not.
Merge gives sorted output, but Union All does not.
Merge can take only 2 inputs, but Union All can take more than 2.
44. How to create a Sub-Report?
Ans. A sub-report is a report item that can be included inside a report; for each row of the main report, the sub-report can be processed. A sub-report is actually a reference to a different report.
45. What are Views?
Ans. Only a SELECT statement can be used inside a view. Views are used:
• To hide the complexity of the underlying database schema, or customize the data and schema for a set of users.
• To control access to rows and columns of data.
• To aggregate data for performance.
We can insert, update, or delete through a view (if aggregate functions or joins are not involved).
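A sketch (dbo.Employee and the view name are hypothetical); because the view has no joins or aggregates, it is updatable:

```sql
-- Hide schema complexity and restrict the visible columns/rows.
CREATE VIEW dbo.vw_ActiveEmployees
AS
SELECT EmployeeID, Name, Department
FROM dbo.Employee
WHERE IsActive = 1;
GO

-- Updates through the view are applied to the base table.
UPDATE dbo.vw_ActiveEmployees
SET Department = 'IT'
WHERE EmployeeID = 7;
```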
46. When one package is called from another package and both are deployed on the server, then what should the child package's path name be?
Ans. HTTP Connection Manager
47. Which is the most efficient way of reporting: a complicated dataset, a stored procedure, or a dynamic procedure?
Ans. It depends; compare them using the profiler and the report execution log.
48. How to create a Drill Down Report?
Ans. Create any report and create a group over the data region. On the details, set the Hidden property to true, and set its visibility to be toggled by a report item of the group.
49. How to get the error description in SSIS?
Ans. In a Script Component, we need to write the following code to get the error description:
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

Public Class ScriptMain
    Inherits UserComponent

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        ' Use the incoming error number as a parameter to GetErrorDescription
        Row.ErrorDescription = ComponentMetaData.GetErrorDescription(Row.ErrorCode)
    End Sub
End Class
50. Explain the Architecture of SSIS?
Ans. SSIS architecture consists of four key parts:
a) Integration Services service: monitors running
Integration Services packages and manages the storage of packages.
b) Integration Services object model: includes managed API
for accessing Integration Services tools, command-line utilities, and custom applications.
c) Integration Services runtime and run-time executables: it
saves the layout of packages, runs packages, and provides support for logging,
breakpoints, configuration, connections, and transactions. The Integration
Services run-time executables are the package, containers, tasks, and event
handlers that Integration Services includes, and custom tasks.
d) Data flow engine: provides the in-memory buffers that
move data from source to destination.
51. What is Synchronous Transformations?
Ans. Row transformations either manipulate data or create
new fields using the data that is available in that row. Examples of SSIS
components that perform row transformations include Derived Column, Data
Conversion, Multicast, and Lookup. While these components might create new
columns, row transformations do not create any additional records. Because each
output row has a 1:1 relationship with an input row, row transformations are
also known as synchronous transformations. Row transformations have the
advantage of reusing existing buffers and do not require data to be copied to a
new buffer to complete the transformation.
52. What is Asynchronous Transformations?
Ans. Partially blocking transformations are often used to
combine datasets. They tend to have multiple data inputs. As a result, their
output may have the same, greater, or fewer records than the total number of
input records. Since the number of input records will likely not match the
number of output records, these transformations are also called asynchronous
transformations. Examples of partially blocking transformation components
available in SSIS include Merge, Merge Join, and Union All. With partially
blocking transformations, the output of the transformation is copied into a new
buffer and a new thread may be introduced into the data flow.
53. What is Blocking transformations?
Ans. Blocking transformations must read and process all
input records before creating any output records. Of all of the transformation
types, these transformations perform the most work and can have the greatest
impact on available resources. Example components in SSIS include Aggregate and
Sort. Like partially blocking transformations, blocking transformations are
also considered to be asynchronous. Similarly, when a blocking transformation is
encountered in the data flow, a new buffer is created for its output and a new
thread is introduced into the data flow.
54. What is Fuzzy Grouping?
Ans. The Fuzzy Grouping transformation performs data cleansing tasks by identifying rows of data that are likely to be duplicates and selecting a canonical row of data to use in standardizing the data.
The transformation produces one output row for each input row, with the following additional columns:
1) _key_in, a column that uniquely identifies each row.
2) _key_out, a column that identifies a group of duplicate rows. The _key_out column has the value of the _key_in column of the canonical data row; rows with the same value in _key_out are part of the same group.
3) _score, a value between 0 and 1 that indicates the similarity of the input row to the canonical row.
We can also specify the value of the similarity threshold.
55. How to do a fast load without failure of the batch?
Ans. You can't: with the fast load option, the batch will fail if a record fails to insert.
56. How to execute a procedure in an Execute SQL Task?
Ans. EXEC ProcedureName ?, ?, ? - each '?' denotes a parameter. The parameters are then set in the parameter mapping.
57. How to get the connection string dynamically?
Ans. By writing an expression on the connection string
property of the connection manager.
58. How to run tasks in parallel?
Ans. Each package has a MaxConcurrentExecutables property, set to -1 by default, which means the number of tasks that can execute in parallel is n + 2, where n is the number of processors on the system. We can change this to an absolute number of tasks that can execute in parallel. In a DF, we can set the number of threads that will execute the different execution trees of the DFT.
59. Which is the Default container of SSIS?
Ans. Task Host Container
60. How to check the quality of data at run time?
Ans. Data Viewer, Data Profiler
61. How to remove a Sort transformation?
Ans. You can avoid the Sort transformation when using an OLE DB source: use an ORDER BY clause in the source query to sort the input records, then set the output's IsSorted property to true and set the sort key positions of the ORDER BY columns.
62. How to put if else in Data Flow and Control Flow?
Ans. Control flow- Precedence constraints
Data flow-Conditional Split transformation
63. How to set the buffer in a Data Flow?
Ans. Each DFT has two properties, DefaultBufferMaxRows and DefaultBufferSize, which control respectively the number of rows that can be stored in a single buffer and the size of a buffer. DefaultBufferMaxRows has a default value of 10,000 rows and DefaultBufferSize has a default value of 10 MB. Any data that cannot fit in the buffer, e.g. BLOB data, is spooled to the file system.
64. How to ignore a failure in Control Flow and Data Flow?
Ans. Control flow: use precedence constraints set to Completion.
Data flow: set the error output to Redirect Row.
65. How to migrate DTS packages to SSIS packages?
Ans. Open BIDS, go to the Project menu, and click Migrate DTS 2000 Package. Give the source location and destination location and complete the wizard steps.
66. How do you upgrade a package from SSIS 2005 to SSIS 2008 R2?
Ans. Use the SSIS Package Upgrade Wizard; opening a 2005 project in BIDS 2008 launches it automatically.
67. How to create an SSAS cube and deploy it? (Building a cube with a data source)
Ans. 1. First create an Analysis Services project.
2. Create a data source (a connection to the server and database, with authentication provided) by right-clicking Data Sources in the project, which starts a wizard. (Extension: .ds)
3. Create a data source view (add tables from one or more databases, add related tables for the added tables; filters can also be added) by right-clicking Data Source Views in the project, which starts a wizard. (Extension: .dsv)
4. Right-click Cubes and click New Cube, which starts the Cube Wizard. Here we have to check the measure group tables, check the relevant measures for the measure group tables, and check the dimension tables. (Extension: .cube)
5. To hide particular measure groups and dimensions from particular users, create perspectives. We can rename the already created perspective, and for a new perspective we can check and uncheck items based on the requirement.
6. In BIDS, we can build and then deploy the cube. After that, we can right-click the particular cube and browse it. In the browse window, we drag and drop row fields and column fields: in the row fields we typically drag dimension attributes, and in the column fields we drag fact measures.
68. How to create an SSAS cube and deploy it? (Building a cube without a data source)
Ans. 1. Right-click Cubes; the Cube Wizard asks for the build method ('with or without using a data source'). Click "Build the cube without using a data source".
2. Then define measures.
3. Define dimensions and dimension usage, and complete the wizard.
4. Now create the data source view and the data source in it; the wizard that pops up this time is called the Schema Generation Wizard.
5. Populate the data source with data so that the cube can be populated.
6. Process the cube and view the cube.
69. How to create an SSRS report using SSAS?
Ans. 1. In the data source, set the source type to Analysis Services and enter the Analysis Services server name.
2. In the dataset, click Query Builder; this opens the SSAS query designer window.
3. Drag and drop fields onto the results pane of the query designer as per the requirement.
4. Several toolbar buttons in the query builder help with MDX, such as Add Calculated Members, Show Empty Cells, Auto Exec, and Show Aggregates.
5. The Auto Exec button turns the execution of queries on and off.
6. Show Empty Cells works the same as the MDX NON EMPTY clause: it shows or hides empty cell values.
7. In the upper half of the design area, the query designer contains a filter grid, which is useful for filtering the results and serves as an easy way to create parameters.
8. Clicking OK in the SSAS query builder generates the MDX query in the dataset's query string.
9. Now we can make any type of report using the fields provided by the dataset.
70. How to create parameters/multivalued parameters in an SSRS report using SSAS?
Ans. 1. In the filter grid, put the values in the filter expression and check the Parameter box.
2. By default, a report parameter will be created, along with a hidden dataset for the parameter.
3. If we right-click the Datasets node and click "Show Hidden Datasets", we will see that dataset.
4. When previewing the report, we get two 'all' values: one 'Select All' and another 'All'. 'Select All' comes from the SSRS multivalued property of the parameter, and the second is driven by the MDX query that SSRS runs to populate the parameter list.
5. We can remove the second one by making a change in the MDX query, i.e. replacing Parameters.AllMembers with Parameters.Children, or by filtering on the condition ParameterLevel > 0 (because the MDX 'All' member has ParameterLevel 0).
71. What is Cube Scheduling?
Ans. When we deploy an SSAS project, the data sources and data source views are put in the local instance of SQL Server Analysis Services. Whatever changes are made to the production database will not be reflected in the Analysis Services instance until we deploy the project again, so we have to process the cube each and every time.
We can process the cube manually (using Management Studio), or:
1) To schedule the processing of the cubes at a particular time, use the Analysis Services Processing Task in SSIS, give it the cube name, and then make a job scheduling this package.
2) Right-click the cube and click Process, click Script, click Copy to Clipboard, and cancel. Then make a new job, paste the script into the command, and schedule the job.
72. What are Dimensions?
Ans. A dimension is a context by which we slice and dice data in a data warehouse. Types: 1) Conformed, 2) Junk, 3) Role-playing.
73. What are Conformed Dimensions?
Ans. Two dimensions are said to be conformed if they are completely identical or strict mathematical subsets of each other. Conformed dimensions also have the same row headers and labels. E.g., the Time dimension used for Fact1 and the Time dimension used for Fact2.
74. What are Junk Dimensions?
Ans. After database modelling, if some attributes are not linked to any of the dimensions or facts, are of low importance, and carry only a few distinct values (mostly flag values or indicators), we can create a single dimension carrying all of these attributes; such dimensions are known as junk dimensions. Benefit: one dimension holding all the attributes performs better than an individual dimension for each attribute. Disadvantage: if in the future we decide to make one of the attributes a fully fledged dimension, we have to restructure the cube.
75. What is a Degenerate Dimension?
Ans. When we have columns on the fact table that do not relate to any of the dimensions but are required for analysis, we can make a separate dimension for the attribute. This dimension is called a degenerate dimension. For example, a transaction number or a ticket number, which is a frequently requested attribute, can be stored separately as distinct values in a degenerate dimension, because the fact table will have multiple records for one transaction id. So while querying or analysing by transaction number or ticket number, we can use the degenerate dimension table. In some degenerate dimensions we can add another column which is not degenerate, such as a running id for the degenerate transaction id. The difference between a normal dimension and a degenerate dimension is that the degenerate attribute should not be made available to the end user for analysis in Analysis Services.
76. What is a Role Playing Dimension?
Ans. Dimensions are often reused for multiple purposes within the same database. A Date dimension can be used for Date_of_Hire, Date_of_Joining etc. Such a dimension is a role playing dimension.
77. Explain Slowly Changing Dimensions?
Ans. Slowly Changing Dimensions (SCDs) are dimensions whose data changes slowly rather than on a regular time-based schedule.
1) Type 0 - No action is taken.
2) Type 1 - Overwrites old data with new data and therefore does not maintain history.
3) Type 2 - Maintains history by inserting multiple records for a natural key with different surrogate keys (a running number against the natural key of the table) and version numbers or Start Date-End Date columns.
4) Type 3 - Contains a column in the table which represents the original state. This column holds the first value of the attribute, whereas the current-state column holds the present value. This does not maintain the history of values in between the original and current state.
5) Type 4 - Contains a table which holds the current record and a separate history table with a create-date column.
6) Type 6 - Combines the approaches of Types 1, 2 and 3. It maintains individual records having a start date, end date and flag. The current record's flag is set to Y and the rest are set to N.
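As a rough sketch of the Type 2 behaviour described above (the DimCustomer table, its columns and the variables are hypothetical, not from the original text), a change to a customer's city could be handled like this:

```sql
-- Expire the current row for the changed natural key
UPDATE DimCustomer
SET EndDate = GETDATE(), CurrentFlag = 'N'
WHERE CustomerNaturalKey = @CustomerNK
  AND CurrentFlag = 'Y';

-- Insert a new version; a fresh surrogate key comes from an IDENTITY column
INSERT INTO DimCustomer (CustomerNaturalKey, City, StartDate, EndDate, CurrentFlag)
VALUES (@CustomerNK, @NewCity, GETDATE(), NULL, 'Y');
```

The fact table always joins on the surrogate key, so old facts keep pointing at the expired version of the row.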
78. What are Factless Fact tables or Bridge Tables?
Ans. If a fact table records many-to-many relationships between dimensions but holds no figures, measures or values, it is known as a factless fact or bridge table. This table is needed to bridge the relationships between dimensions. For example, suppose we have a Product dimension containing the product key and product-related information, and an Offer dimension containing the offer key and discount-related information, its timing and pricing. We can make a bridge table ProductOffer which stores the products that have special offers. If the fact table stored only Product, Offer and Sale as columns, we would miss the information about products which had special offers but no sales.
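A minimal sketch of the ProductOffer bridge table from the example (column and dimension table names are illustrative):

```sql
-- One row per product/offer pairing, with no measures:
-- the relationship itself is the "fact".
CREATE TABLE ProductOffer (
    ProductKey INT NOT NULL REFERENCES DimProduct(ProductKey),
    OfferKey   INT NOT NULL REFERENCES DimOffer(OfferKey),
    PRIMARY KEY (ProductKey, OfferKey)
);
```

A query against this table alone can answer "which products were on offer", even for products with no sales rows.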
79. What are Snapshot and Transaction fact tables?
Ans. A transaction fact table records events, and for each event certain measurements or values are recorded. So when we record a sale, we create a new row altogether for the information relating to that sale. A snapshot fact table records the state of something at different points in time, for example the sales of each product every month. A snapshot fact table therefore keeps aggregated information, which results in fewer data rows, smaller cubes, faster cube processing and faster querying, but it has no information at the individual transaction level.
80. What is UDM (Unified Dimensional Model)?
Ans. It provides a bridge between the user and the data sources. A UDM is constructed over one or more physical data sources, and the user then issues queries against the UDM.
81. What is the two-tier architecture of SSAS?
Ans. SSAS uses both server and client components to supply OLAP and data mining functionality.
1) The server component is implemented as a Microsoft Windows service.
2) Clients interact with it using standard XML.
82. Difference between a Derived measure and a Calculated measure?
Ans. 1) A derived measure is calculated before aggregations are created, and its value is stored in the cube.
2) A calculated measure is calculated after aggregations are created, and its value is not stored in the cube.
83. What is a Cube?
Ans. Cubes are a logical representation of multidimensional data. A cube stores large amounts of historical data, shows point-in-time data as well as member-level data, and its security can be very fine-grained.
84. Differences between OLTP and OLAP
Ans. 1) OLTP contains the source information of a system; OLAP contains the data from OLTP.
2) OLTP controls and runs the fundamental business tasks like insertion, updation etc., whereas OLAP is meant for planning, analysis and decision support.
3) OLTP performs transactions like insert, update, delete etc.; OLAP performs querying and analysis of the aggregated OLTP data.
4) OLTP basically uses smaller and less complex queries, but OLAP uses complex queries.
5) OLTP is normalized to 3NF, but OLAP is denormalized with a star schema or snowflake schema.
6) OLTP uses a relational schema, but OLAP uses a multidimensional schema.
7) Backup of OLTP is very important, but in the case of OLAP we can always rebuild it from OLTP.
8) Data in OLTP can be seen instantly, but in OLAP a batch or scheduler will refresh the data.
85. Star Schema vs Snowflake Schema
Ans. Snowflake Schema: no redundancy, normalized, more complex queries, more joins, more foreign keys; good to use for a small DW, or when a dimension table is very big, since normalizing it saves space.
Star Schema: redundant data, denormalized, less complex queries, fewer joins; good to use for large DWs with small to medium dimension tables.
86. What is Normalization?
Ans. Normalization is a mathematical model for evaluating the relational quality of a data model. Its purpose is to reduce update anomalies.
87. What is First Normal Form (1NF)?
Ans. The 1st Normal Form states that:
1) Every unit of data should be represented by a scalar value.
2) Each and every attribute i.e. column should be unique in an entity.
3) Each and every tuple should be unique in an entity.
Consider a case where a table has columns basecamp, tour1, tour2, tour3. This table violates 1NF because there are several attributes holding the same meaning (a repeating group). We can make two tables, one having basecampid and basecamp, and the other having basecampid and tour, to bring it to 1NF.
88. What is Second Normal Form (2NF)?
Ans. The 2nd Normal Form states that:
1) The entity should be in 1NF, and
2) There should be no partial dependency on the primary key. Suppose in the basecamp-tour table we have columns basecampid, tour and the phone number of the basecamp. The PK of the table is a composite primary key on the basecampid and tour columns, but phonenumber depends only on basecamp, not on tour, so this violates 2NF. We can make two tables: basecamp-tour having basecampid and tour, and a second table having basecampid and phonenumber.
89. What is Third Normal Form (3NF)?
Ans. The 3rd Normal Form states that:
1) The entity should be in 2NF, and
2) There should be no transitive dependency on the primary key. Suppose there is a table basecamp-tour with columns Basecamp_id, Tour, Guide and GuideDateofHire. Here GuideDateofHire depends on Guide, which in turn depends on Basecamp_id; that is why this violates 3NF. To resolve it, we can make another table Guide having Guide and GuideDateofHire.
90. What is Boyce-Codd Normal Form?
Ans. It sits between the third and fourth normal forms. It handles entities which have two sets of candidate keys. In such a case BCNF states that the entity should be split into two entities, one for each candidate key.
91. What is Fourth Normal Form?
Ans. It reduces the problems created by complex composite primary keys (multivalued dependencies). If there are three attributes in an entity, of which two combine to form a composite primary key, but without the third attribute the row is not unique, then the entity violates 4NF. In such a case we should split the table into two separate tables.
92. What is Fifth Normal Form?
Ans. 5NF provides the method for designing relationships that involve three or more entities. A three-way or ternary relationship, if properly designed, is in fifth normal form.
93. What are the Recovery Models in SQL Server 2008?
Ans. Simple, Full, Bulk-Logged.
ALTER DATABASE databasename SET RECOVERY SIMPLE
94. What is the Simple Recovery Model?
Ans. 1) SQL Server stores only minimal information in the transaction log.
2) The changes made to the database are written to the data file and to the log.
3) But the log records are not kept permanently.
4) It should be used on a database meant for development or test, where there is a reduced need for frequent database backups.
Backup options for this model are Full and Differential (no log backups).
95. What is the Full Recovery Model?
Ans. This allows us to recover everything and can use the fullest set of backup options. Every insert, update and delete is logged into the transaction log. The committed changes are truncated from the log once a backup of the log is taken. It allows point-in-time recovery, meaning we can recover the database to a particular point in time. Backup options for this model are Full, Differential and Transaction Log.
96. What is the Bulk-Logged Model?
Ans. This is the same as the Full Recovery model, but it is used in cases of bulk operations like CREATE INDEX, SELECT INTO and BULK INSERT. All the bulk operations are written minimally to the transaction log, thereby improving performance. Backup options for this model are Full, Differential and Transaction Log.
97. What are the types of Backups?
Ans. Full, Differential and Transaction Log.
98. What is a Full Backup?
Ans. A full backup restores all our data, but it cannot restore the data to a specific point in time.
99. What is a Differential Backup?
Ans. A differential backup copies all the data changes since the last full backup, so these backups are usually faster and smaller than full backups. A full backup creates a checkpoint, meaning that the differential backup has to start from that point.
100. What is a Transaction Log Backup?
Ans. A transaction log backup copies all the data changes since the last full backup or the previous transaction log backup. These backups are fast and small, but the demerit shows at recovery time, when we must restore the full backup followed by every transaction log backup in sequence.
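The three backup types above map directly to BACKUP statements; a minimal sketch (the database name and disk paths are illustrative):

```sql
-- Full backup
BACKUP DATABASE SalesDW TO DISK = 'D:\Backups\SalesDW_full.bak';

-- Differential backup (changes since the last full backup)
BACKUP DATABASE SalesDW TO DISK = 'D:\Backups\SalesDW_diff.bak'
    WITH DIFFERENTIAL;

-- Transaction log backup (requires Full or Bulk-Logged recovery model)
BACKUP LOG SalesDW TO DISK = 'D:\Backups\SalesDW_log.trn';
```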
101. What is MSDTC?
Ans. MSDTC (Microsoft Distributed Transaction Coordinator) is a transaction manager which allows a client application to include different sources of data within one transaction.
For example, SSIS can include data from different data sources in one transaction to maintain the ACID properties.
102. What is ACID?
Ans. Atomic, Consistent, Isolated and Durable.
103. What is a Report Subscription?
Ans. A subscription is a standing request to deliver a report at a specific time or in response to a specific event, in an application file format that you specify in the subscription.
Two types: Standard Subscription and Data-driven Subscription.
Standard Subscription - consists of static values and cannot be modified during subscription processing.
Data-driven Subscription - gets subscription information at run time by querying an external data source that provides the values used to specify recipient, parameters or application format.
104. How do we get XML from a SELECT query?
Ans. <select query> FOR XML RAW/AUTO/PATH/EXPLICIT;
select * from emp for xml raw
<row EmpName="Shakti" DeptID="2" Sal="3000"/><row EmpName="Praveen" DeptID="2" Sal="4000"/><row EmpName="Rohit" DeptID="1" Sal="2000"/><row EmpName="Wamsi" DeptID="1" Sal="1500"/><row EmpName="Kapil" DeptID="2" Sal="1300"/><row EmpName="XXX" DeptID="10" Sal="1000"/><row EmpName="Rahul" DeptID="1" Sal="1000"/>
select * from emp for xml auto
<emp EmpName="Shakti" DeptID="2" Sal="3000"/><emp EmpName="Praveen" DeptID="2" Sal="4000"/><emp EmpName="Rohit" DeptID="1" Sal="2000"/><emp EmpName="Wamsi" DeptID="1" Sal="1500"/><emp EmpName="Kapil" DeptID="2" Sal="1300"/><emp EmpName="XXX" DeptID="10" Sal="1000"/><emp EmpName="Rahul" DeptID="1" Sal="1000"/>
select * from emp for xml path
<row><EmpName>Shakti</EmpName><DeptID>2</DeptID><Sal>3000</Sal></row><row><EmpName>Praveen</EmpName><DeptID>2</DeptID><Sal>4000</Sal></row><row><EmpName>Rohit</EmpName><DeptID>1</DeptID><Sal>2000</Sal></row><row><EmpName>Wamsi</EmpName><DeptID>1</DeptID><Sal>1500</Sal></row><row><EmpName>Kapil</EmpName><DeptID>2</DeptID><Sal>1300</Sal></row><row><EmpName>XXX</EmpName><DeptID>10</DeptID><Sal>1000</Sal></row><row><EmpName>Rahul</EmpName><DeptID>1</DeptID><Sal>1000</Sal></row>
select 1 as Tag,
       NULL as Parent,
       EmpName as [Employee!1!EmpName],
       NULL as [DeptID!2!DeptID],
       NULL as [Sal!3!SAL]
from emp
union
select 2 as Tag,
       1 as Parent,
       EmpName as [Employee!1!EmpName],
       DeptID as [DeptID!2!DeptID],
       NULL as [Sal!3!SAL]
from emp
union
select 3 as Tag,
       2 as Parent,
       EmpName as [Employee!1!EmpName],
       DeptID as [DeptID!2!DeptID],
       Sal as [Sal!3!SAL]
from emp
order by EmpName
for xml explicit
<Employee EmpName="Kapil"><DeptID DeptID="2"><Sal SAL="1300"/></DeptID></Employee><Employee EmpName="Praveen"><DeptID DeptID="2"><Sal SAL="4000"/></DeptID></Employee><Employee EmpName="Rahul"><DeptID DeptID="1"><Sal SAL="1000"/></DeptID></Employee><Employee EmpName="Rohit"><DeptID DeptID="1"><Sal SAL="2000"/></DeptID></Employee><Employee EmpName="Shakti"><DeptID DeptID="2"><Sal SAL="3000"/></DeptID></Employee><Employee EmpName="Wamsi"><DeptID DeptID="1"><Sal SAL="1500"/></DeptID></Employee><Employee EmpName="XXX"><DeptID DeptID="10"><Sal SAL="1000"/></DeptID></Employee>
The values in the Tag and Parent meta columns, the information provided in the column names, and the correct ordering of the rows together produce the XML you want when you use EXPLICIT mode.
105. What is SET XACT_ABORT { ON | OFF }?
Ans. Specifies whether SQL Server automatically rolls back
the current transaction when a Transact-SQL statement raises a run-time error.
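A short sketch of the difference XACT_ABORT makes (the Accounts table is illustrative, not from the original text):

```sql
-- With XACT_ABORT ON, a run-time error in either statement aborts and
-- rolls back the whole transaction, instead of only the failing statement.
SET XACT_ABORT ON;
BEGIN TRANSACTION;
    UPDATE Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
    UPDATE Accounts SET Balance = Balance + 100 WHERE AccountID = 2;
COMMIT TRANSACTION;
```

With XACT_ABORT OFF (the default), many run-time errors roll back only the statement that failed, and the transaction stays open.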
106. What is a Computed Column?
Ans. A computed column is computed from an expression that can use other columns in the same table. The expression can be a noncomputed column name, constant, function, or any combination of these connected by one or more operators. The expression cannot be a subquery.
For example, in the AdventureWorks2008R2 sample database, the TotalDue column of the Sales.SalesOrderHeader table has the definition: TotalDue AS Subtotal + TaxAmt + Freight.
Unless otherwise specified, computed columns are virtual columns that are not physically stored in the table; their values are recalculated every time they are referenced in a query. The PERSISTED keyword in the CREATE TABLE and ALTER TABLE statements tells the Database Engine to physically store the computed column in the table.
Computed columns can be used in select lists, WHERE clauses, ORDER BY clauses, or any other location in which a regular column expression can be used, with the following exceptions:
1) Computed columns used in CHECK, FOREIGN KEY, or NOT NULL constraints must be marked PERSISTED. A computed column can be used as a key column in an index, or as part of any PRIMARY KEY or UNIQUE constraint, if the computed column value is defined by a deterministic expression and the data type of the result is allowed in index columns.
For example, if the table has integer columns a and b, the computed column a + b can be indexed, but the computed column a + DATEPART(dd, GETDATE()) cannot be indexed, because the value may change in subsequent invocations.
2) A computed column cannot be the target of an INSERT or UPDATE statement.
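The definitions above can be sketched in a small table (the table and column names are illustrative):

```sql
CREATE TABLE OrderLine (
    Qty             INT,
    UnitPrice       MONEY,
    LineTotal       AS Qty * UnitPrice,           -- virtual: recalculated on each read
    LineTotalStored AS Qty * UnitPrice PERSISTED  -- physically stored; deterministic, so indexable
);
```

Here LineTotal and LineTotalStored are never supplied in an INSERT; only Qty and UnitPrice are.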