What is Data Mining?

Data mining is the process of identifying trends and patterns in large data sets.
A typical project follows these steps (the phases of the CRISP-DM process model):

  1. Business understanding
  2. Data understanding
  3. Data preparation
  4. Modeling
  5. Evaluation
  6. Deployment

The data is usually collected and stored in data warehouses.
We then apply suitable data mining algorithms to identify trends.
Popular techniques include clustering and regression trees.
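
As a rough illustration of clustering, here is a minimal sketch in R using the built-in iris data set and the kmeans() function from base R. The choice of three clusters and the seed value are assumptions made for this example only.

```r
# Cluster the built-in iris measurements into three groups with k-means.
set.seed(42)                          # k-means starts from random centers
features <- iris[, 1:4]               # numeric columns only; Species is left out
fit <- kmeans(features, centers = 3, nstart = 10)

# Compare the discovered clusters with the known species labels.
table(cluster = fit$cluster, species = iris$Species)
```

The table shows how closely the clusters found by the algorithm line up with the species, even though the species column was never given to kmeans().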

Data mining is commonly used for:

  1. Mining for patterns
  2. Mining for associations
  3. Mining for correlations
  4. Mining for clusters
  5. Mining for predictive analysis

What Are Deep Neural Networks?

A deep neural network (DNN) is an artificial neural network (ANN) with multiple hidden layers between the input and output layers. DNNs can model complex non-linear relationships. DNN architectures generate compositional models where the object is expressed as a layered composition of primitives. The extra layers enable composition of features from lower layers, potentially modeling complex data with fewer units than a similarly performing shallow network.

Deep architectures include many variants of a few basic approaches. Each architecture has found success in specific domains. It is not always possible to compare the performance of multiple architectures unless they have been evaluated on the same data sets.

DNNs are typically feedforward networks in which data flows from the input layer to the output layer without looping back.
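
To make the feedforward idea concrete, here is a minimal sketch in base R of a forward pass through a network with two hidden layers. The layer sizes, the random weights, and the ReLU activation are illustrative assumptions; nothing is trained here.

```r
# Forward pass through a small feedforward network with two hidden layers.
# The layer sizes and random weights below are illustrative assumptions.
relu <- function(x) pmax(x, 0)

set.seed(1)
layer_sizes <- c(4, 8, 8, 3)          # input, two hidden layers, output

# One weight matrix and one bias vector per pair of adjacent layers.
weights <- lapply(seq_len(length(layer_sizes) - 1), function(i) {
  matrix(rnorm(layer_sizes[i] * layer_sizes[i + 1], sd = 0.5),
         nrow = layer_sizes[i], ncol = layer_sizes[i + 1])
})
biases <- lapply(layer_sizes[-1], function(n) rep(0, n))

forward <- function(x, weights, biases) {
  activation <- x
  n_layers <- length(weights)
  for (i in seq_len(n_layers)) {
    z <- activation %*% weights[[i]]
    z <- sweep(z, 2, biases[[i]], "+")
    # Hidden layers use ReLU; the output layer is left linear here.
    activation <- if (i < n_layers) relu(z) else z
  }
  activation
}

x <- as.matrix(iris[1:5, 1:4])        # five example inputs with four features
output <- forward(x, weights, biases)
dim(output)                           # one row per input, one column per output unit
```

Data only moves forward through the loop, from one layer to the next, which is what distinguishes this kind of network from a recurrent one.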

Recurrent neural networks (RNNs), in which data can flow in any direction, are used for applications such as language modeling. Long short-term memory (LSTM) networks are particularly effective for this use.

Convolutional deep neural networks (CNNs) are used in computer vision. CNNs also have been applied to acoustic modeling for automatic speech recognition (ASR).

Reference: https://deeplearning4j.org/neuralnet-overview

Many applications have been developed with deep learning, and they are very helpful to other well-known applications.

What Is Data Wrangling?

Data wrangling is the process of cleaning, structuring and enriching raw data into a desired format for better decision making in less time. In other words, it is the process of cleaning and unifying messy and complex data sets for easy access and analysis.

  1. As the volume of data and the number of data sources grow rapidly, it is increasingly essential to organize the large amounts of available data for analysis.
  2. This process typically includes manually converting/mapping data from one raw form into another format to allow for more convenient consumption and organization of the data.

The goals of data wrangling:

  1. Reveal a “deeper intelligence” within your data, by gathering data from multiple sources
  2. Put accurate, actionable data in the hands of business analysts in a timely manner
  3. Reduce the time spent collecting and organizing unruly data before it can be utilized
  4. Enable data scientists and analysts to focus on the analysis of data, rather than the wrangling
  5. Drive better decision-making by senior leaders in an organization

The key steps to data wrangling:

  1. Data Acquisition: Identify and obtain access to the data within your sources
  2. Joining Data: Combine the edited data for further use and analysis
  3. Data Cleansing: Redesign the data into a usable/functional format and correct/remove any bad data
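
The three steps above can be sketched in base R as follows. The two data frames, their column names, and their values are hypothetical stand-ins for real data sources.

```r
# 1. Data acquisition: two small hypothetical sources stand in for real ones.
customers <- data.frame(
  id   = c(1, 2, 3, 3),
  name = c("Asha", "Ben", " Chloe ", " Chloe "),
  stringsAsFactors = FALSE
)
orders <- data.frame(
  id     = c(1, 2, 3, 4),
  amount = c(250, NA, 120, 75)
)

# 2. Joining data: combine the sources on the shared id column.
joined <- merge(customers, orders, by = "id")

# 3. Data cleansing: trim whitespace, drop missing amounts and duplicate rows.
joined$name <- trimws(joined$name)
clean <- joined[!is.na(joined$amount), ]
clean <- clean[!duplicated(clean), ]
clean
```

In this sketch the order with id 4 is dropped by the join because it has no matching customer, the row with a missing amount is removed, and the repeated customer row is collapsed into one.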

How to Remove Duplicate Data in R

During data cleansing, you often need to remove duplicate values from your data. A very useful application of subsetting data is to find and remove duplicate values. R has a useful function, duplicated(), that finds duplicate values and returns a logical vector that tells you whether each value is a duplicate of a previous value. This means that duplicated() returns FALSE for the first occurrence of a value and TRUE for every following occurrence of that value, as in the following example:

> duplicated(c(1,2,1,6,1,8))
[1] FALSE FALSE  TRUE FALSE  TRUE FALSE

If you try this on a data frame, R automatically checks the observations (meaning, it treats every row as a value). So, for example, with the data frame iris:

> duplicated(iris)
  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [10] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
....
[136] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
[145] FALSE FALSE FALSE FALSE FALSE FALSE

If you look carefully, you notice that row 143 is a duplicate (because the 143rd element of your result has the value TRUE). You also can tell this by using the which() function:

> which(duplicated(iris))
[1] 143

Now, to remove the duplicate from iris, you need to exclude this row from your data. Remember that there are two ways to exclude data using subsetting:

  • Specify a logical vector, where FALSE means that the element will be excluded. The ! (exclamation point) operator is a logical negation. This means that it converts TRUE into FALSE and vice versa. So, to remove the duplicates from iris, you do the following:

> iris[!duplicated(iris), ]

  • Specify negative values. In other words:

> index <- which(duplicated(iris))
> iris[-index, ]

In both cases, you’ll notice that your instruction has removed row 143.
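
As a further option, base R also provides unique(), which on a data frame keeps only the first occurrence of each row. A short sketch comparing it with the logical-vector approach:

```r
# unique() on a data frame drops rows that repeat an earlier row.
deduped <- unique(iris)

nrow(iris)      # number of rows before removing the duplicate
nrow(deduped)   # one row fewer, because row 143 has been dropped

# Check that both approaches give the same result.
identical(deduped, iris[!duplicated(iris), ])
```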

How To Install R & RStudio

R is an open source, case-sensitive programming language. RStudio is an integrated development environment (IDE) for R, and its makers are active members of the R community.

You need to install both R and RStudio on your system before getting started with R. This page guides you through the installation process and introduces you to both of them.

Install R

Step 1: Download the package relevant to your system (Windows or Mac or Linux) from the Comprehensive R Archive Network (CRAN) website.

Step 2: Install R like you normally install any new software package.

Now, Install RStudio

Step 1: Download the RStudio Desktop package from the RStudio website.

Step 2: Install RStudio by following its setup process.

Before you move on, make sure you have installed both R and RStudio on your system. In this lecture, you will be introduced to different components of RStudio.
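
One way to confirm the installation is to type a couple of commands into the R console (or the Console pane in RStudio). This is only a suggested check. The RStudio.Version() call is guarded because that function is assumed to be available only when the code runs inside RStudio.

```r
# Print the installed R version, for example to compare with the CRAN download.
R.version.string

# Inside RStudio, also report the IDE version; plain R skips this step.
if (exists("RStudio.Version")) {
  RStudio.Version()$version
}
```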

How To Host a Static Website on WampServer

This guide assumes you have already installed WAMP on your Windows machine. If not, visit WampServer Installation first.

1. After installing Wamp, go to the System Tray (Notification Area) and click the Wamp icon.

2. Now click the www directory. It will open a folder.

3. Delete all the files there and paste in the HTML project that was working fine on your local machine without a server.

In our case, the project is in the folder \cakewebsitetemplate

4. To run the project, go to the System Tray (Notification Area) > click the Wamp icon > click Localhost. You will find the project folder there.

5. Click the project, i.e. cakewebsitetemplate

Congrats, your static website is now running on WampServer.

Steps To Install WordPress on Wamp Server

WordPress is a free and open source blogging and CMS tool. WordPress is easy to manage and, at the same time, a very powerful tool. You can easily extend the functionality of WordPress by installing plugins. It is estimated that 15% of present websites are built on WordPress. To your surprise, WPWebHost is also built on WordPress.
WampServer is an open source web development platform for Windows. It allows you to create web applications with Apache2, PHP and a MySQL database. WampServer consists of the following software: Apache, MySQL and PHP/phpMyAdmin.

Why install WordPress on WampServer?

WordPress is a popular choice for creating websites because of its user-friendliness and powerful features. But installing WordPress directly on your hosting and getting along with it can be a tough process. You also need to test your website, theme, etc. to rest assured that it will not crash on the live server. So we set up WampServer (which has all the basic requirements to install WordPress), install WordPress on it, and then test our website before its final launch. You also get hands-on experience with WordPress.

1~ Download setup files

  1. Download ‘WampServer’ from its download page: http://www.wampserver.com/en/#download-wrapper
  2. Download ‘WordPress’ from its download page: https://wordpress.org/download/

2~ Install WampServer

Let’s install WampServer on your Windows machine. Just follow this simple step-by-step procedure:
(Although WampServer 3.0.6 was used for this tutorial, you will find it helpful for any version of WampServer.)
NOTE: I suggest not changing the names I’ve used in the tutorial, or you may find yourself in trouble. Only change names if you’re sure you can keep up with the changes.

1. Start the setup by opening the file you downloaded in step 1 of downloading setup files.

2. Accept the agreement and click the ‘Next’ button.

3. Select your installation folder and click ‘Next’. I recommend leaving it as it is.

4. Click ‘Next’ to continue the installation.

5. On the last screen, click ‘Finish’ to complete the installation.

Survival Of The Fittest

A phrase famously associated with Darwin’s theory. “Survival of the fittest” can also describe a company that is able to sustain itself in the volatile global market.

Recently I decided to order a few books and came across the Flipkart 10th Anniversary Sale.

Flipkart was founded in 2007 by Sachin Bansal and Binny Bansal, both alumni of IIT Delhi who had worked at Amazon, and it has come a long way.

Three years later, in 2010, they started the acquisition game, and they now have big names on their acquisition list.

Not only did they acquire Myntra, they also bought its rival, the fashion shopping website Jabong.

PayZippy may have been a failure, but they realised the power of digital money before demonetization and acquired the payment start-up PhonePe in April 2016. In the recent sale, they are promoting PhonePe heavily by offering 30% cashback. I decided not to install it today, as I already have 7 digital payment apps on my OnePlus 3 phone.

A couple of years ago there were many fish in the e-commerce market, but today, whether you buy from Myntra, Jabong or Flipkart, Flipkart gets the benefit.

The Flipkart 10th Anniversary Sale is being promoted on the home pages of Jabong and Myntra as well.

Where is Snapdeal?

Flipkart survived cut-throat competition and took full advantage of digital tools, and now the big fish is going to eat Snapdeal. Once Flipkart acquires Snapdeal, only two competitors will be left in the market, i.e. Paytm and Amazon. I believe competition is necessary too.

Though the Snapdeal acquisition will benefit Tiger Global more than Flipkart, it doesn’t seem a bad deal in the long run.

So far I have purchased from all the above-mentioned sites, either to test them or to get better products. Even today I ordered 3 books to get a few more drops of the knowledge in them.

Let’s see what the next step will be after the Snapdeal acquisition.

How To Open and Handle Large CSV Files

Recently I was struggling to open large CSV files using Excel 2013 on a Windows 7 64-bit machine. Whenever I opened a big CSV, Excel threw an error.

It is sometimes said that with Excel 2013 the limit is gone and the only restriction is your machine’s memory. However, that applies only to data you handle with DAX and aggregate with pivot tables; you still cannot edit the raw data. The worksheet row limit in Excel 2013 is still 1,048,576.

At that point I started searching for software that can open large files, and I came across Delimit, which was very helpful.

Its features are as follows:

  1. Quickly open any delimited data file.
  2. Edit any cell.
  3. Easily convert files from one delimiter to another like CSV to TAB.
  4. Split-up any delimited file into file parts of equal size.
  5. Join multiple delimited files into one resulting file.
  6. Quickly select which columns to extract and in which order.
  7. Extract data from any delimited file by specifying the columns, rows and/or filter to apply.
  8. Sort any delimited data file based on cell content.
  9. Remove duplicate rows based on user specified columns.
  10. Bookmark any cell for quick subsequent access.
  11. Open large delimited data files; 100s of MBs or GBs in size!
  12. Open data files up to 2 billion rows and 2 million columns large!
  13. Work with: character delimited, string delimited, fixed column width or just plain text files.
  14. Quickly see all your bookmarks, double-click to jump to any of them or click to rename.
  15. Keep track of long running operations.
  16. Keep track of the current selection.
  17. Scroll to any part of the file or split the view into multiple panes.
  18. Freeze the 1st row of any file.
  19. Open multiple files and quickly switch between them.
  20. Configure built-in and custom file delimitation rules for automatic parsing of files.

It is available with a 15-day free trial and can be downloaded from http://www.delimitware.com/

Another option for opening big files is Sublime Text. It is a proprietary cross-platform source code editor with a Python application programming interface.
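
If you only need to process the data rather than view it, a further option is to read the file in chunks with R. The following is a minimal sketch in base R. The function name count_csv_rows and its path and chunk_size parameters are hypothetical, and it assumes the file has a header row and no quoted fields that contain line breaks.

```r
# Count the data rows of a CSV file without loading it all into memory.
# `path` and `chunk_size` are hypothetical parameters for this sketch.
count_csv_rows <- function(path, chunk_size = 100000) {
  con <- file(path, open = "r")
  on.exit(close(con))

  header <- readLines(con, n = 1)     # keep the header to parse each chunk
  total <- 0
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0) break
    # Re-attach the header so read.csv() names the columns consistently.
    chunk <- read.csv(text = c(header, lines), stringsAsFactors = FALSE)
    total <- total + nrow(chunk)      # replace with any per-chunk processing
  }
  total
}

# Usage with a small temporary file standing in for a large one.
tmp <- tempfile(fileext = ".csv")
write.csv(iris, tmp, row.names = FALSE)
count_csv_rows(tmp, chunk_size = 40)
```

Because only chunk_size lines are held in memory at a time, the same loop can filter, aggregate, or write out parts of a file that is too large to open in a spreadsheet.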