Friday, September 08, 2017

Wake up and Learn Big Data

I am starting a tutorial series on big data for SQL developers. Now is the time to learn big data and build these skills in parallel with your existing ones, so that you stay competitive in the current job market.
In this series, I will guide you through Apache Hadoop installation, development, and other big data technologies.
Companies are now considering these technologies for their new solutions, which will create new jobs, so this learning is worth adding to your profile.
Course content:
1.     What is big data?
2.     Why big data is required.
3.     Different big data technologies.
4.     Background of Apache Hadoop.
5.     Prerequisites for installing Apache Hadoop.
6.     Installing Ubuntu on a dual-boot system.
7.     Understanding the basic architecture of Apache Hadoop.
8.     Setting up the machine for Hadoop.
9.     Hadoop single-node cluster installation.

Thursday, June 01, 2017

What are the types of database schema in a data warehouse?

Star Schema:

A star schema is one in which a central fact table is surrounded by denormalized dimension tables. A star schema can be simple or complex. A simple star schema consists of one fact table, whereas a complex star schema has more than one fact table.
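To make the layout concrete, here is a minimal sketch of a simple star schema using Python's built-in sqlite3 module. The table names (fact_sales, dim_product, dim_date), the columns, and the sample rows are all made up for illustration; a real warehouse would have many more attributes.

```python
import sqlite3

# Hypothetical retail example: one fact table and two denormalized dimensions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT      -- kept inline (denormalized), not in its own table
);
CREATE TABLE dim_date (
    date_id       INTEGER PRIMARY KEY,
    calendar_date TEXT,
    year          INTEGER
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id    INTEGER REFERENCES dim_date(date_id),
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'Laptop', 'Electronics'), (2, 'Desk', 'Furniture');
INSERT INTO dim_date VALUES (1, '2017-01-15', 2017), (2, '2017-02-20', 2017);
INSERT INTO fact_sales VALUES (1, 1, 1000.0), (1, 2, 500.0), (2, 1, 250.0);
""")

# A typical star-schema query: join the fact table to each dimension once.
rows = conn.execute("""
SELECT p.category, d.year, SUM(f.amount)
FROM fact_sales f
JOIN dim_product p ON p.product_id = f.product_id
JOIN dim_date d ON d.date_id = f.date_id
GROUP BY p.category, d.year
ORDER BY p.category
""").fetchall()
print(rows)
```

Every dimension is exactly one join away from the fact table, which is what gives the schema its star shape.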

Snowflake Schema:

A snowflake schema extends the star schema by splitting dimensions into additional, normalized dimension tables. Snowflake schemas are useful when the dimensions contain low-cardinality attributes.
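As a rough illustration, one common reading of the snowflake idea is that a low-cardinality attribute, such as a product category, moves out of the dimension into its own table. The sketch below assumes that reading; the table names, columns, and rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical low-cardinality attribute split into its own dimension table.
CREATE TABLE dim_category (
    category_id   INTEGER PRIMARY KEY,
    category_name TEXT
);
CREATE TABLE dim_product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT,
    category_id  INTEGER REFERENCES dim_category(category_id)
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL
);
INSERT INTO dim_category VALUES (1, 'Electronics'), (2, 'Furniture');
INSERT INTO dim_product VALUES (1, 'Laptop', 1), (2, 'Phone', 1), (3, 'Desk', 2);
INSERT INTO fact_sales VALUES (1, 1000.0), (2, 600.0), (3, 250.0);
""")

# Reaching the category now takes two joins instead of one.
totals = conn.execute("""
SELECT c.category_name, SUM(f.amount)
FROM fact_sales f
JOIN dim_product p ON p.product_id = f.product_id
JOIN dim_category c ON c.category_id = p.category_id
GROUP BY c.category_name
ORDER BY c.category_name
""").fetchall()
print(totals)
```

The trade-off is less repeated text in the dimension at the cost of an extra join per query.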

Galaxy Schema:

Galaxy schema contains many fact tables with some common dimensions (conformed dimensions). This schema is a combination of many data marts.

Fact Constellation Schema:

The dimensions in this schema are segregated into independent dimensions based on the levels of hierarchy. For example, if geography has five levels of hierarchy (territory, region, country, state, and city), a constellation schema would have five dimensions instead of one.

Monday, August 15, 2016

What are the limitations in SSRS on SQL Server express edition?

Microsoft offers Reporting Services for free as part of the SQL Server Express with Advanced Services edition, but it has the following limitations:

Management Studio cannot be used to administer the report server
Report Models are not available
Report Builder is not available
Caching, history, and delivery of reports are not available
SQL Server Agent is not available
No scheduling is possible
A remote server database cannot be used as a report data source (a local SQL Server is the only option)
The report server database cannot be stored on a remote server (it has to be local)
Reports can be rendered only in Excel, PDF, and image formats
Reporting Services cannot use more than 1 GB of RAM
No subscriptions (standard or data-driven) can be created
Cannot be integrated with SharePoint
Cannot implement role-based security
Only named instances are supported
Scale-out report servers are not available

Wednesday, June 01, 2016

What is Statistics?

Statistics is the science that deals with the collection, classification, analysis, and interpretation of numerical facts or data, and that, by use of mathematical theories of probability, imposes order and regularity on aggregates of more or less disparate elements.
This definition has two parts:
The collection, classification, analysis, and interpretation of numerical facts or data
The use of probability theory to impose order on aggregates of data
In simpler words, statistics deals with summarizing information about data in a meaningful and relevant way.
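As a small illustration of summarizing data, the sketch below uses Python's standard statistics module on a made-up list of exam scores. The numbers are invented for the example.

```python
import statistics

# Made-up exam scores, purely for illustration.
scores = [72, 85, 90, 66, 85, 78, 94]

summary = {
    "count": len(scores),
    "mean": statistics.mean(scores),      # arithmetic average
    "median": statistics.median(scores),  # middle value when sorted
    "mode": statistics.mode(scores),      # most frequent value
    "stdev": statistics.stdev(scores),    # sample standard deviation
}

for name, value in summary.items():
    print(name, value)
```

A handful of numbers like these often describes a dataset well enough to start asking better questions about it.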

Statistics is also about using data to predict things that are unknown. For example, after polling a sample of voters, we might say we are 95% confident that more people will vote for Candidate A than for Candidate B.

Statistical analysis is like solving mysteries with data. We start with questions and attempt to answer them with data instead of our intuition. When we assemble enough data we make predictions.

With predictions, there's always a chance that we'll be wrong. Much of statistics is understanding what we know from the data we do have, making our best prediction about the data we don't have, and clearly understanding and quantifying the chance that we're wrong.
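One way to quantify that chance, sketched here under simplifying assumptions, is a confidence interval for a poll result. The example assumes a hypothetical simple random sample of 1,000 voters, 540 of whom prefer Candidate A, and uses the normal approximation with z = 1.96 for roughly 95% confidence. The function name and the poll numbers are made up.

```python
import math

def proportion_confidence_interval(successes, n, z=1.96):
    """Approximate confidence interval for a proportion (normal approximation)."""
    p_hat = successes / n                               # observed sample share
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)     # z times the standard error
    return p_hat - margin, p_hat + margin

# Hypothetical poll: 540 of 1,000 sampled voters prefer Candidate A.
low, high = proportion_confidence_interval(540, 1000)
print(f"Estimated support for A: between {low:.3f} and {high:.3f}")

# If the whole interval sits above 0.5, the sample supports "A is ahead"
# at roughly the 95% level, under the assumptions stated above.
print("A likely ahead:", low > 0.5)
```

A smaller sample would widen the interval, which is the statistical way of saying we are less sure.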