MD. MAHBUBUL AMIN MAJUMDER'S SITE
 


Data Mining will be more profitable than Coal Mining

 

 


 
 
What is Data Mining?

Data mining is the extraction of hidden predictive information from large databases.
It is, in some ways, an extension of statistics, with a few artificial intelligence and machine learning twists thrown in.
Like statistics, data mining is not a business solution; it is just a technology.
It uses a combination of machine learning, statistical analysis, modeling techniques and database technology.
Data mining finds patterns and subtle relationships in data and infers rules that allow the prediction of future results.
Its typical applications include market segmentation, customer profiling, fraud detection, evaluation of retail promotions, and credit risk analysis.
It is used both to increase revenues (through improved marketing) and to reduce costs (through detecting and preventing waste and fraud).

 

The foundations of data mining


The following timeline shows the evolutionary steps of data mining:

Evolutionary Step | Business Question | Enabling Technologies | Product Providers | Characteristics
Data Collection (1960s) | "What was my average total revenue over the last five years?" | Computers, tapes, disks | IBM, CDC | Retrospective, static data delivery
Data Access (1980s) | "What were unit sales in New England last March?" | Relational databases (RDBMS), Structured Query Language (SQL), ODBC | Oracle, Sybase, Informix, IBM, Microsoft | Retrospective, dynamic data delivery at record level
Data Navigation (1990s) | "What were unit sales in New England last March? Drill down to Boston." | On-line analytic processing (OLAP), multidimensional databases, data warehouses | Pilot, IRI, Arbor, Redbrick, Evolutionary Technologies | Retrospective, dynamic data delivery at multiple levels
Data Mining (2000) | "What's likely to happen to Boston unit sales next month? Why?" | Advanced algorithms, multiprocessor computers, massive databases | Lockheed, IBM, SGI, numerous startups (nascent industry) | Prospective, proactive information delivery

 ref: http://www.thearling.com/text/dmwhite/dmwhite.htm (October 2008)
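
The business questions in the timeline can be made concrete with a small sketch. The table name, columns and figures below are invented for illustration, and the final "forecast" is only a naive trend extension standing in for a real data mining model.

```python
import sqlite3

# Hypothetical sales table; the schema and figures are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, city TEXT, month TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("New England", "Boston", "2008-01", 100),
        ("New England", "Boston", "2008-02", 110),
        ("New England", "Boston", "2008-03", 120),
        ("New England", "Hartford", "2008-03", 80),
        ("Midwest", "Chicago", "2008-03", 200),
    ],
)

# Data access: "What were unit sales in New England last March?"
region_total = conn.execute(
    "SELECT SUM(units) FROM sales WHERE region = ? AND month = ?",
    ("New England", "2008-03"),
).fetchone()[0]

# Data navigation: "... Drill down to Boston." (totals per city in the region)
by_city = dict(
    conn.execute(
        "SELECT city, SUM(units) FROM sales "
        "WHERE region = ? AND month = ? GROUP BY city",
        ("New England", "2008-03"),
    ).fetchall()
)

# Data mining: "What's likely to happen to Boston unit sales next month?"
# Naive stand-in for a real model: extend the average month-over-month change.
boston = [
    row[0]
    for row in conn.execute(
        "SELECT units FROM sales WHERE city = ? ORDER BY month", ("Boston",)
    )
]
avg_change = (boston[-1] - boston[0]) / (len(boston) - 1)
forecast = boston[-1] + avg_change

print(region_total, by_city, forecast)
```

The first two queries only look back at stored data, while the last step looks forward, which is the shift from retrospective to prospective delivery that the timeline describes.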

What do data miners do?


Nowadays, almost all organisations have:

Massive data collection and storage
Powerful multiprocessor computers

All they need is a suitable data mining algorithm with which they can reach the Right Person at the Right Time through the Right Channel. Finding it is what data miners do for these organisations.

How do data miners mine?


 

1. The data miners get access to the client's database using OLAP (Online Analytical Processing), so that live transactions are not hampered.

2. They prepare the data for mining.

3. They discover a model by applying suitable algorithms.

4. They apply this model to make predictions.

5. Throughout the whole process they use:

Algorithms and suitable models
A suitable database programming language such as SQL
Statistical tools
Statistical packages and programming languages such as SAS and SPSS

Note: The diagrams in this section were collected from the internet.
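
The prepare, model and predict steps described above can be sketched in a few lines. This is a minimal illustration only: the records are invented, and ordinary least squares is chosen here as one simple example of "an algorithm", not as the method any particular data miner uses.

```python
# Hypothetical raw records (advertising spend, unit sales); None marks a missing value.
raw = [(1.0, 12.0), (2.0, 14.0), (None, 15.0), (3.0, 16.0), (4.0, 18.0)]

# Prepare the data: drop incomplete records.
clean = [(x, y) for x, y in raw if x is not None and y is not None]


# Discover a model: ordinary least squares for the line y = intercept + slope * x.
def fit_line(points):
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = fit_line(clean)


# Apply the model: predict the outcome for an input that was not in the data.
def predict(x):
    return intercept + slope * x


print(predict(5.0))
```

In practice the preparation step is usually the largest part of the work, and the model would be checked against held-out data before its predictions are trusted.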

Who mines their data?


 

Government Organizations
Non-Governmental Organizations (NGOs)
Research Institutions
Financial Institutions
Corporations
Business Organizations
Hospitals
Airlines, Railways, Shipping and other Transport Companies
Any other organization that keeps historical and longitudinal data

 

How is privacy maintained?


 

1. The databases most commonly used to maintain privacy are:

      Oracle
      MS SQL Server
      Sybase
      MS Access, etc.

2. A database has the following security features:

      Access authentication
      User privileges
      Encryption

3. With the help of database technology, an organization can maintain the following three levels of privacy:

      Do not allow any data mining of the customer's data
      Allow data mining for internal use
      Allow data mining for both internal and external uses
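
As a rough sketch, the three privacy levels could be stored per customer and checked before any data is handed to a mining job. The names `MiningConsent` and `minable` and the customer ids are hypothetical, introduced only for this illustration; a real system would enforce this inside the database through user privileges.

```python
from enum import Enum


class MiningConsent(Enum):
    """Hypothetical encoding of the three privacy levels."""
    NONE = 0                   # no data mining of the customer's data
    INTERNAL = 1               # data mining for internal use only
    INTERNAL_AND_EXTERNAL = 2  # data mining for internal and external uses


def minable(customers, external):
    """Return ids of customers whose data may be mined for the given purpose."""
    required = (
        MiningConsent.INTERNAL_AND_EXTERNAL if external else MiningConsent.INTERNAL
    )
    return [cid for cid, level in customers.items() if level.value >= required.value]


# Invented sample data: one customer at each level.
customers = {
    "c1": MiningConsent.NONE,
    "c2": MiningConsent.INTERNAL,
    "c3": MiningConsent.INTERNAL_AND_EXTERNAL,
}

print(minable(customers, external=False))
print(minable(customers, external=True))
```

Ordering the levels numerically keeps the check to a single comparison: a customer who allows external use also allows internal use.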

 

How can an organization benefit?


   

1. Nowadays most organizations have:

Client information database
Client transaction history
Client's CIB information, particularly for banks
Client performance information
Detailed information on income and expenditure

2. There is no need to spend money to gather information.

3. They can mine their existing large databases, which usually sit idle, and may discover:

unexpected ways to minimize expenditure
non-obvious ways to identify genuine depositors or good borrowers
the probabilistic behavior of defaulters
where to focus attention now, etc.
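
One simple way to look at the "probabilistic behavior of defaulters" is to estimate the observed default rate for each client segment from existing loan history. The segments and records below are invented for illustration, and a plain frequency count is only a starting point, not a full credit risk model.

```python
from collections import defaultdict

# Hypothetical loan history: (client segment, whether the loan defaulted).
loans = [
    ("salaried", False),
    ("salaried", False),
    ("salaried", True),
    ("salaried", False),
    ("self-employed", True),
    ("self-employed", False),
]


def default_rates(records):
    """Return the observed share of defaulted loans for each segment."""
    totals = defaultdict(int)
    defaults = defaultdict(int)
    for segment, defaulted in records:
        totals[segment] += 1
        defaults[segment] += int(defaulted)
    return {segment: defaults[segment] / totals[segment] for segment in totals}


rates = default_rates(loans)
print(rates)
```

With so few records the rates are unreliable; the point is only that the answer comes from data the organization already holds.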

 

Conclusion


If you apply data mining techniques to your existing database today, you may find non-obvious answers to the following questions:

         Who should be issued a credit card?

         Which clients should be given a higher credit limit?

         Which groups of people should be your target for the above facilities?

         What types of products should be introduced for which types of customers?

 
 
   
 

Copyright © 2005 Md. Mahbubul Amin Majumder. All rights reserved.