Saturday 29 August 2015

Role-Playing Dimension in Tabular Model (SSAS)

A role-playing dimension is a single dimension that has multiple relationships with the fact table, where each relationship logically represents a different role. The date dimension is the most commonly used role-playing dimension: the fact table typically has several date columns, each of which references the date dimension.

In a Tabular model there is no direct way to assign different roles to a dimension, but there are workarounds available to get the role-playing behaviour.

Let's take the AdventureWorks example:

In this example we will use the FactInternetSales fact table and the DimDate dimension. The following date keys are present in the fact table:

  1. OrderDateKey
  2. DueDateKey
  3. ShipDateKey

Create the first relationship between FactInternetSales and DimDate on the basis of Order Date.


This is the active relationship between FactInternetSales and DimDate.

Now create a relationship between FactInternetSales and DimDate on the basis of Ship Date.


You will find that the Active checkbox is unchecked and that there is a dotted line between FactInternetSales and DimDate, which represents the inactive relationship.


This is how you can have multiple relationships between a fact table and a dimension. Now we will use these relationships while calculating measures.

Create a measure Internet Total Sales:=SUM([SalesAmount]).


Process the model and analyse the data in Excel. You will find that the measure is calculated on the basis of the active relationship, which is Order Date.




Now let's create another measure using the inactive Ship Date relationship.

Internet Total Sales on Ship Date:=CALCULATE(SUM([SalesAmount]),USERELATIONSHIP(FactInternetSales[ShipDateKey],DimDate[DateKey]))
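The same pattern extends to the third role. Assuming you have also created an inactive relationship on DueDateKey, a Due Date measure would follow the same USERELATIONSHIP approach (a sketch; the measure name is illustrative):

```dax
Internet Total Sales on Due Date:=CALCULATE(
    SUM([SalesAmount]),
    USERELATIONSHIP(FactInternetSales[DueDateKey], DimDate[DateKey])
)
```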




Process the model and analyse the data in Excel. You will find that the measure is calculated on the basis of the inactive relationship, which is Ship Date.




Sunday 5 April 2015

Triggering an SSIS package by placing a file in a particular location

In SQL Server 2012 you can trigger an SSIS package by placing a file in a particular location. This can be done by using the FileTable feature, which was newly introduced in SQL Server 2012.

A FileTable is a special table in which you can store files in SQL Server, while the files remain accessible as if they were stored in the file system.

When a file is placed inside the FileTable location, a record gets inserted into the FileTable. You can have an AFTER INSERT trigger on the FileTable, and the trigger in turn executes the package.

Enable the FILESTREAM feature on the server:

Step 1: Execute the following query:

EXEC sp_configure filestream_access_level, 2
RECONFIGURE

Step 2: Open SQL Server Configuration Manager, go to the properties of the SQL Server service and open the FILESTREAM tab.


Enable FILESTREAM for the database

Step 1: Run the following code for your database:

ALTER DATABASE database_name
SET FILESTREAM ( NON_TRANSACTED_ACCESS = FULL, DIRECTORY_NAME = N'directory_name' )

Step 2: Open the properties of the database, go to the Filegroups tab and add a FILESTREAM filegroup:


Step 3: Go to the Files tab and add a file with the file type 'FILESTREAM Data':


Creation of the FileTable:

Create the FileTable by executing the following query:

CREATE TABLE FileDropLoc AS FileTable
WITH
(FileTable_Directory = 'FileDropLoc');
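Besides dropping files into the Windows share, you can also add a file to the FileTable directly with T-SQL, which is handy for testing. A minimal sketch (the file name and contents are illustrative):

```sql
-- name and file_stream are built-in columns of every FileTable.
INSERT INTO dbo.FileDropLoc (name, file_stream)
VALUES (N'trigger_test.txt', CAST('Hello FileTable' AS VARBINARY(MAX)));
```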

Then you can access the file location as:

\\<ip-address>\mssqlserver\DarkHorse\FileDropLoc

Here mssqlserver is the instance-level FILESTREAM share name, DarkHorse is the database-level DIRECTORY_NAME and FileDropLoc is the FileTable directory.

Place a file in the above path and run the following query:
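A plain SELECT against the FileTable is enough to see the new record. For example (the column list below is a subset of the fixed FileTable schema):

```sql
SELECT stream_id, name, file_type, cached_file_size, creation_time
FROM dbo.FileDropLoc;
```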


You will find that one record is present inside the FileTable.

You can also find the FileTable created under the FileTables node in Object Explorer.


Now you can write an AFTER INSERT trigger on the FileTable and execute the package you want by using the dtexec command-line utility, or you can run a SQL Server Agent job which in turn runs the package.
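A minimal sketch of such a trigger, assuming the package is wrapped in a SQL Server Agent job named 'RunMyPackage' (the trigger and job names are illustrative):

```sql
CREATE TRIGGER dbo.Tri_FileDropLoc
   ON dbo.FileDropLoc
   AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- sp_start_job is asynchronous, so the file insert is not
    -- blocked while the SSIS package runs.
    EXEC msdb.dbo.sp_start_job @job_name = N'RunMyPackage';
END
```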

Saturday 4 April 2015

How to use the fast load option in the OLE DB destination (avoiding bulk data rejection due to violation of constraints)?

The common issue with the OLE DB destination in fast load mode is that if an error occurs due to a constraint violation, the whole batch fails and is redirected to the error flow. In this blog I will show you a workaround:
  1. Remove all the constraints from the table.
  2. Have a trigger to validate row-level constraints such as NOT NULL and CHECK constraints, where each constraint applies to a single row only.
  3. Have a stored procedure to validate set-level constraints such as primary key and unique constraints.
Let me illustrate this with an example:

There is a table called Emp which has the following columns and constraints:

The DeptID column has a foreign key reference to the DeptID column of the Dept table. The structure of the Dept table is below:


The data inside the Dept table is below:



For the workaround I have created two tables, Emp_Temp and Emp_Error.

Emp_Temp: Same structure as Emp, without any constraints.
Emp_Error: Same structure as Emp_Temp, with an extra column named ErrorDesc to store the error description.

Now we will populate the Emp table by picking data from a flat file source. Following is the data present inside the flat file:





The package should be like this:


There should be a trigger on the Emp_Temp table, as follows:


CREATE TRIGGER dbo.Tri_EMP
   ON dbo.Emp_Temp
   AFTER INSERT
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Capture rows that violate the row-level NOT NULL rules.
    INSERT INTO Emp_Error (EMPID, EMPNAME, DeptID, Salary, ErrorDesc)
    SELECT EMPID, EMPNAME, DeptID, Salary, 'Not null violation'
    FROM inserted
    WHERE Salary IS NULL
       OR EMPNAME IS NULL
END
GO

The following property needs to be set in the OLE DB destination so that the trigger will be fired (add FIRE_TRIGGERS to the FastLoadOptions property):


There should be a procedure as below which will enforce the PK and FK constraints:

CREATE PROCEDURE EMP_Constraint
AS
BEGIN
    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    -- Capture duplicate-key rows (PK violations).
    INSERT INTO [dbo].[Emp_Error] (EMPID, EMPNAME, DeptID, Salary, ErrorDesc)
    SELECT EMPID, EMPNAME, DeptID, Salary, 'PK Violation.'
    FROM dbo.Emp_Temp
    WHERE EMPID IN (SELECT EMPID FROM Emp_Temp GROUP BY EMPID HAVING COUNT(1) > 1)

    -- Capture rows whose DeptID has no match in Dept (FK violations).
    INSERT INTO [dbo].[Emp_Error] (EMPID, EMPNAME, DeptID, Salary, ErrorDesc)
    SELECT EMPID, EMPNAME, Emp_Temp.DeptID, Salary, 'FK Violation.'
    FROM dbo.Emp_Temp
    LEFT JOIN dbo.Dept
        ON Emp_Temp.DeptID = Dept.DeptID
    WHERE Dept.DeptID IS NULL

    -- Remove every row that was flagged as an error.
    DELETE dbo.Emp_Temp
    WHERE EMPID IN (SELECT EMPID FROM Emp_Error)
END

Once the data has moved into the Emp_Temp table, it needs to be transferred to the actual Emp table.
Try it out for your scenario; it works even for bulk amounts of data. I have used it for around 50 million rows. Good luck.
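The final transfer from the staging table to the target table can be a plain INSERT...SELECT, run after the EMP_Constraint procedure has removed the error rows (a sketch):

```sql
-- Move the validated rows into the real table, then clear staging.
INSERT INTO dbo.Emp (EMPID, EMPNAME, DeptID, Salary)
SELECT EMPID, EMPNAME, DeptID, Salary
FROM dbo.Emp_Temp;

TRUNCATE TABLE dbo.Emp_Temp;
```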