Friday, 27 October 2017

111 Approach - On Premises Enterprise BI Architecture & Design from a bird's-eye view.

111 Approach (One, One, One Approach)

How fast can the data be provided and with what level of accuracy? What's the best approach?

The best approach is what I refer to as the 111 Approach. This approach accomplishes the following:
  • Minimum BI interference/dependency to get the ball rolling
  • An agile approach, a standard software engineering approach
  • Self-service BI by the business, improving the accuracy of the requirements before they go to the IT department/designers
  • Understanding the data in advance and getting feedback before design and production
  • Providing relational information within the requirements
See the overall objective and complete agenda here: Overall Objective and Agenda

111 Approach Overview

1 Day

Assuming you have a BI environment up and running, your data scientist is dealing with all the corporate data and suddenly he comes up with an idea. He imports some external data into any tool that he is comfortable with (in this case, I picked Power BI) and studies the data.


1 day data scientist


Finally, a new self-service report is created that is handed over to the analyst/business team for feedback.


1 day final

1 Week

The data scientist asks the IT department to find an environment for the external data so that it can be shared within the team. The IT department copies the external data to an environment where it can be accessed, but it will not support this new entity; this is a one-day job for the IT department.

1 Week Initial

Now the analyst/business team studies the data and provides feedback, documentation and, most importantly, the requirements for the IT department/designers.

At this point we have enough information/requirements to confirm the design.

1 Week Final

1 Month (1 Sprint)

The IT department/designers now have the requirements and can start adding the new external data into the current corporate BI environment so that it follows all the BI standards. For example, the external data will be incrementally loaded, and it will follow the auditing process, data cleansing, framework reports, etc.

1 month
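The incremental load mentioned above can be sketched as a watermark pattern: record the highest timestamp loaded so far, and on each run pick up only the rows modified after it. This is a minimal Python sketch, not the actual SSIS implementation; the entity, column and config names are all hypothetical.

```python
from datetime import datetime

# Hypothetical config entry: the high-water mark recorded after the last load.
config = {"ExternalSales": {"last_loaded": datetime(2017, 10, 1)}}

# Hypothetical source rows; ModifiedDate drives the incremental filter.
source_rows = [
    {"Id": 1, "Amount": 100.0, "ModifiedDate": datetime(2017, 9, 15)},
    {"Id": 2, "Amount": 250.0, "ModifiedDate": datetime(2017, 10, 20)},
]

def incremental_load(entity, rows):
    """Return only rows changed since the last load, then advance the watermark."""
    watermark = config[entity]["last_loaded"]
    new_rows = [r for r in rows if r["ModifiedDate"] > watermark]
    if new_rows:
        config[entity]["last_loaded"] = max(r["ModifiedDate"] for r in new_rows)
    return new_rows

loaded = incremental_load("ExternalSales", source_rows)
# Only the row modified after the 1 Oct watermark is picked up.
```

In a real environment the watermark would live in the Config DB rather than in memory, so that every entity's load is restartable and auditable.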

Sunday, 15 October 2017

Today's BI Design - On Premises Enterprise BI Architecture & Design from a bird's-eye view.

Today's Enterprise BI Architecture

A good enterprise BI design must be done as though the application will be put up for sale. It should be able to accommodate any input data, or be as flexible as that particular BI business logic requires. The legacy data should depend on the Source DB, because the Source DB's design is closer to that of the BI DW.
 
See the overall objective and complete agenda here: Overall Objective and Agenda
 
Current Enterprise BI Architecture

Generally you will need the following databases to design an optimized BI environment:

  1. DL DW
    A Data Lake data warehouse is known as the Source DB; it's the landing area of the data. Some designers like to call it a flat database because it does not have any Foreign Keys (FK) and the tables likely have no more than 2-3 indexes.
  2. BI DW
    This is the location where part of the business logic is applied. Other things, like SCD Type 2 using HASHBYTE, also happen in this database. You must know the business to create optimized tables with the right Primary Keys (PK) and indexes. One of the things that I generally do is create all the FK constraints and make sure that they are all disabled.
  3. DM
    A DM is a database, but not a SQL database, and it's the second location where business logic is applied. When I am evaluating a DM, I look for:

    1. How many measures are being designed
    2. The hierarchies and the cross joins of the hierarchies
    3. Perspectives
    4. Partitioning
    5. Dynamic processes
    6. Documentation
    See more
  4. Config DB
    The configuration database controls the entire enterprise application. It contains lots of non-confidential information and supports tools such as:

    1. BI Incremental load mechanism
    2. HASHBYTE Process
    3. Internal/External logs
    4. Auditing
    5. Report generator tools
    6. Multi-language support
    7. Documentation
    8. Date Dimension
    9. SQL and DM Partition
  5. Other Databases
    Depending on your company's security/design policies, you might need more databases/environments. For example, you will need SSISDB for SSIS Catalog deployment, or you might need the [msdb] database because you are trying to fire off a SQL Server Agent job using a front-end tool.
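The SCD Type 2 HASHBYTE comparison mentioned for the BI DW can be illustrated with a row-hash check: hash the tracked columns of the incoming row and compare against the hash of the active dimension row; a mismatch means the old version must be expired and a new one inserted. This is a Python sketch standing in for T-SQL's HASHBYTES; the table and column names are hypothetical.

```python
import hashlib

def row_hash(row, columns):
    """Concatenate the tracked columns and hash them, mimicking HASHBYTES('SHA2_256', ...)."""
    payload = "|".join(str(row[c]) for c in columns)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

# Columns whose changes should produce a new SCD Type 2 version.
TRACKED = ["Name", "City"]

# Current active dimension row and the incoming source row.
dim_row = {"CustomerKey": 1, "Name": "Contoso", "City": "Toronto"}
incoming = {"CustomerKey": 1, "Name": "Contoso", "City": "Vancouver"}

# If the hashes differ, the source changed: expire the old row, insert a new version.
changed = row_hash(dim_row, TRACKED) != row_hash(incoming, TRACKED)
```

Comparing one stored hash per row is much cheaper than comparing every tracked column individually, which is why the hash is usually persisted alongside the dimension row.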

The best practice is a combination of the old approach and the new BI Architecture.
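Of the Config DB components listed above, the Date Dimension is the simplest to illustrate: it is typically generated once, one row per calendar day, and shared by every model. A minimal Python sketch follows; the attribute set shown is a small, hypothetical subset of what a real date dimension carries.

```python
from datetime import date, timedelta

def build_date_dimension(start, end):
    """Generate one row per day with a few typical date-dimension attributes."""
    rows = []
    current = start
    while current <= end:
        rows.append({
            "DateKey": int(current.strftime("%Y%m%d")),  # surrogate key, e.g. 20171027
            "Date": current,
            "Year": current.year,
            "MonthNumber": current.month,
            "IsWeekend": current.weekday() >= 5,  # Saturday/Sunday
        })
        current += timedelta(days=1)
    return rows

# First week of 2017: seven rows, starting on Sunday 1 January.
dim_date = build_date_dimension(date(2017, 1, 1), date(2017, 1, 7))
```

In practice the same table usually also carries fiscal periods, holiday flags and multi-language month names, which is why it sits in the Config DB rather than being recomputed per report.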



Friday, 6 October 2017

Legacy BI Design - On Premises Enterprise BI Architecture & Design from a bird's-eye view.

Objective

It's always good to know the history of everything, including the history of BI: how it started, and how it changed and transformed into what it is today. Did you know that the term BI was born around the year 2005? Prior to that, it was known as Management Information System (MIS).

See the overall objective and complete agenda here: Overall Objective and Agenda

In the Late 90's...

Back in the 90's, a copy of the day-to-day Transaction DB was taken. It was known as the "Sync DB" or "Reporting DB", and all the reports pointed to this database. There were many ways to achieve this, including SQL replication, mirroring, log shipping, etc. Reporting was basically SSRS and Crystal Reports.

Why was it designed like this? The Transaction DB was getting big and the number of transactions was growing. The reports were locking/blocking the transactions and making the system slow.

Late 90's Reporting

Circa the year 2000

The Sync DB was renamed to Golden DB and some basic logic was added. The Golden DB definition was never clear.



Year 2000 reporting

2005-2015ish

This was the start of the cube era; the Golden DB was designed as if it were a Transactional DB and not a BI DB.

Most BI Designs



Thursday, 5 October 2017

On Premises Enterprise BI Architecture & Design from a bird's-eye view.

Objective

Over the years, I have been asked by many people how I design an enterprise-level BI solution from scratch. They're looking to learn what approaches I take to make my designs perform with a high level of accuracy and efficiency. To achieve these things, we need to look at many aspects of the overall design, including the framework in use and the set of tools that are available or that we have to build.

In order to explain my process in depth and help you better understand the approach I take, I will create a series of blog posts to cover BI Design of the past and present, as well as the recommended approach and BI tools. I have also put together a PowerPoint presentation that I will update over time to include other topics covered in this blog series. You can access this PowerPoint presentation here: OnPrem Enterprise BI Design.pptx

Agenda

  1. Legacy BI Designs
  2. Today's BI Design
  3. List of BI Tools
  4. 111 Approach
  5. Reporting Frameworks

Saturday, 20 May 2017

Partition - Document and Track changes in your Tabular Model

Objective

Extract the metadata from a tabular model using a PowerShell script and SQL Server 2016. In this section, we are going to extract the list of Tabular partitions and their corresponding metadata.

Requirements

Please see the requirements listed here: Document and Track changes in your Tabular Model

Steps

If you do not want to store metadata in a separate database, please proceed to step 3.
  1. Download the script file: 0500 Partition.sql
  2. Run the script in SSMS.
  3. Download the PowerShell script: 500--Partition.ps1.
  4. Use PowerShell ISE (or other PowerShell tool) to open the PowerShell file.
  5. Set the parameters in the PowerShell script to align with your environment (see Requirements).
    1. This script can either display the information or populate the table created in the step above.
  6. Run the PowerShell script (generally I use a *.bat file to load all of my PowerShell scripts).

Conclusion

Once the above steps have been completed, your SQL Server instance will have a table called [Partition] within your [ASTabularPropertyCollector] database containing the metadata information for your SQL tabular model.

Querying this table should provide results similar to those seen below.

Partition

Thursday, 4 May 2017

Creating SSIS Package Components and Features Programmatically using .NET

Problem Statement

This might sound strange, but I am writing about a subject that has been around since 2005. However, since then, the objects within SSIS 2016 have changed and I recently had a customer request that prompted me to document my findings.

I needed to add and configure some features in an SSIS package and I had old .NET code that did the magic for me in 2007. I used the same code for my SQL 2016 SSIS package but it didn't work. WHAT?!?!

Solution

Some of the changes that were required included the following:
  1. The OLE DB provider is now "Provider=SQLNCLI11.1"
  2. The LogProvider version in SQL 2016 is no longer 2; it's 5: "DTS.LogProviderSQLServer.5"
    1. NOTE: As of when this blog post was written, this information could not be found on MSDN Online.
  3. And more...

Benefits of having objects in the package created programmatically

  1. The package framework allows for a standardized approach.
  2. Achieve more consistency in your SSIS packages created through the package framework.
  3. Save time and be more accurate in designing the required changes.
  4. Understand the inner layer of SSIS package design.

Solution Breakdown

I am going to create or configure the objects below in a sample SSIS package:
  1. Add a Connection Object
      Dim conMgrDest As ConnectionManager = Pkg.Connections.Add("OLEDB")
  2. Add Event Handlers
      Pkg.EventHandlers.Add("OnError")
  3. Set the SSIS Logging
      Dim logProv As LogProvider = Pkg.LogProviders.Add("DTS.LogProviderSQLServer.5")
      logProv.ConfigString = conMgrDest.Name ' point the log provider at the logging connection manager
      Pkg.LoggingOptions.SelectedLogProviders.Add(logProv)
      Pkg.LoggingMode = DTSLoggingMode.Enabled
      Pkg.LoggingOptions.EventFilter = arrayEventFilter

Steps

  1. Download the Sample Solution
  2. I have created 4 empty SSIS packages that are in the "SampleEmptySSISPackages" folder; in the case of my customer, I had 300 packages.
  3. Copy all packages into the "ToBeProcessed" folder.
  4. Open the solution, "DotNetCodeOnSSISPkgs.sln".
  5. Open "pkgFwDotNetCodeOnSSISPkgs.dtsx" and go to Parameters.
  6. Make sure that the "uPara_FolderPath" is pointing to the location of your "ToBeProcessed" folder.
  7. To create the Connection Manager, you will need to set the "SSISLogServer" and the "SSISLogDB".
  8. Run the SSIS package

Validating Results

To check the results, open the packages in the "ToBeProcessed" folder in Visual Studio. You will see the changes mentioned below:
  1. A new Connection Object, "conn_OLEDB_SSISLoggingConnection", is added.
  2. The Event Handlers: OnError, OnPostExecute and OnPreExecute events have been added at the package level.
  3. In the menu SSIS --> Logging, you will see the following items:
    1. Containers: Package is checked
    2. Providers and logs: a new log provider, "My SSIS Log Provider for SQL Server", has been added, and the same connection object has been used
    3. Details: OnError, OnExecuteStatusChanged, OnPostExecute, OnPreExecute are all checked
NOTE: To do the test again, you must copy the sample files from the "SampleEmptySSISPackages" folder into the "ToBeProcessed" folder and rerun the "pkgFwDotNetCodeOnSSISPkgs.dtsx" package.

Tuesday, 22 November 2016

Perspective Hierarchy - Document and Track changes in your Tabular Model

Objective

Extract the metadata from a tabular model using a PowerShell script and SQL Server 2016. In this section, we are going to extract the list of Tabular perspective hierarchies and their corresponding metadata.

Requirements

Please see the requirements listed here: Document and Track changes in your Tabular Model

Steps

If you do not want to store metadata in a separate database, please proceed to step 3.
  1. Download the script file: 0386 PerspectiveHierarchy.sql
  2. Run the script in SSMS.
  3. Download the PowerShell script: 386--PerspectiveHierarchy.ps1
  4. Use PowerShell ISE (or other PowerShell tool) to open the PowerShell file.
  5. Set the parameters in the PowerShell script to align with your environment (see Requirements).
    1. This script can either display the information or populate the table created in the step above.
  6. Run the PowerShell script (generally I use a *.bat file to load all of my PowerShell scripts)

Conclusion

Once the above steps have been completed, your SQL Server instance will have a table called [PerspectiveHierarchy] within your [ASTabularPropertyCollector] database containing the metadata information for your SQL tabular model.

Querying this table should provide results similar to those seen below.

Perspective Hierarchy