Research on time series data is an active direction in many industries, and such studies build on a body of background knowledge. This chapter reviews issues related to the knowledge discovery process in general and summarizes the literature on time series data mining techniques in particular. It briefly covers representation and dimensionality reduction techniques, indexing structures, similarity measures, segmentation, and related topics. The chapter also states the thesis's motivation and objectives and outlines its organization, with a short introduction to each chapter.
Rapidly increasing computing power and the falling cost of high-volume data storage have made it possible to collect very large amounts of data. This situation has given rise to the relatively new field of data mining and knowledge discovery, which aims to make such data understandable and to take advantage of them. Data mining is the process of looking for hidden information, patterns, or structures in a large data set. The process involves several tasks, for example extraction, selection, preprocessing, and transformation of the characteristics that describe the data from different points of view. Knowledge discovery is the mining of data with the main purpose of obtaining knowledge in the form of patterns that are novel, useful, interesting, understandable, and automatically explainable.
Data in many application areas now have special characteristics that impose new challenges and requirements on the data mining task. A particularly interesting category of dynamic data is the data stream, which flows continuously in and out of systems at high speed (for example Internet traffic data, sensor data, and position tracking data); it is usually impossible to store all of the data or to scan them multiple times. Among the many data mining problems, discovering knowledge from time series remains an interesting task that deserves further discussion.
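The single-pass constraint on data streams can be illustrated with a small sketch. It assumes Welford's online algorithm for the running mean and variance, which is a standard technique chosen here for illustration and is not stated in the source; the `RunningStats` class and the `sensor_stream` generator are hypothetical names, and the readings are made-up values.

```python
class RunningStats:
    """One-pass mean and variance (Welford's method) with constant memory."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        # Fold one new observation into the summary; x is not stored.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Sample variance; undefined for fewer than two observations.
        return self._m2 / (self.n - 1) if self.n > 1 else float("nan")


def sensor_stream():
    """Hypothetical stand-in for a live feed; yields readings one at a time."""
    for reading in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
        yield reading


stats = RunningStats()
for value in sensor_stream():
    stats.update(value)  # each reading is seen once and then discarded
```

Because only three numbers are kept, the summary can follow a stream of any length without storing or rescanning it.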
Business intelligence is one of the information technology fields that has grown and evolved the most in the last few years. Through business intelligence it is possible to improve the decision-making process in virtually any department, organization, or industry. Organizations have seen how business intelligence has changed over this time and how the tools have evolved, offering more functionality to analysts while also providing solutions for more users. Information requirements have also grown exponentially: data volumes that once amounted to only a few gigabytes are rapidly moving into much larger ranges. In these scenarios the traditional strategies were often not fast enough to satisfy business needs, so a new, flexible technological or algorithmic approach was needed to address this challenge.
The first step in statistical data analysis is data collection. In this step, raw data are collected from different data sources and preprocessed, and decisions are then made about which methods, such as feature selection, to apply to the results. If this step collects incorrect data, the resulting knowledge may be inaccurate, and so may the decisions based on that knowledge. There are many methods and tools for data collection. First of all, data collection should be a defined procedure within the knowledge management process. The procedure should be properly documented and followed by the people involved in data collection. It defines a set of data collection points, and the data extraction techniques and tools are chosen according to those points. In addition to the collection points and the extraction mechanism, data storage is also defined in this step. Most organizations now use a database application for this purpose.
Many different techniques for exploratory data analysis can be applied in the data processing step. Descriptive statistics such as the mean, variance, and range of the variables are calculated. This step also addresses dimensionality reduction, which converts data of very high dimensionality into data of much lower dimensionality such that each of the remaining dimensions conveys much more information. Reducing the dimensionality of the original series is very important because spatial access methods perform best when the number of dimensions is as low as possible. In some applications, reducing the dimension of the data is useful only if the original features remain usable afterwards. If this is not a concern, other techniques that reduce the dimension of a data set can be considered. There are many approaches to dimensionality reduction; what is needed is an approach that reduces the number of dimensions while keeping the shape of the original data and removing unimportant points.
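As a minimal sketch of this step, the following Python code computes the descriptive statistics named above and then reduces a series with Piecewise Aggregate Approximation (PAA). PAA is a common time series reduction technique used here only as an illustration; the source does not say which method the thesis adopts. The function names `describe` and `paa` and the sample series are assumptions made for the example.

```python
from statistics import mean, variance


def describe(series):
    """Descriptive statistics mentioned in the text: mean, variance, range."""
    return {
        "mean": mean(series),
        "variance": variance(series),  # sample variance
        "range": max(series) - min(series),
    }


def paa(series, n_segments):
    """Piecewise Aggregate Approximation.

    Splits the series into n_segments consecutive chunks of near-equal
    length and replaces each chunk by its mean, so the coarse shape of
    the series is kept in far fewer values.
    """
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    reduced = []
    for i in range(n_segments):
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        reduced.append(mean(series[start:end]))
    return reduced


# Made-up series with a single peak in the middle.
series = [1.0, 2.0, 3.0, 4.0, 10.0, 12.0, 11.0, 9.0, 3.0, 2.0, 1.0, 0.0]
summary = describe(series)
reduced = paa(series, 4)  # 12 points reduced to 4 segment means
```

The reduced series still rises to a peak and falls again, which is the sense in which the shape of the original data is preserved. PAA averages points rather than discarding individual ones, so a method that selects important points would be a different design choice.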
Forecasting is the process of making statements about events whose actual outcomes have not yet been observed. Prediction is a similar but more general term. Both can refer to formal statistical methods that employ time series, cross-sectional, or longitudinal data. Risk and uncertainty are central to forecasting and prediction, and it is generally considered good practice to indicate the degree of uncertainty associated with a forecast. Time series prediction is usually understood as predicting the next few values of a numeric or symbolic series. There is a very large body of literature on the prediction of numeric time series, especially in the field of statistics.
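The practice of reporting uncertainty alongside a forecast can be sketched with the naive (last-value) method, chosen here only because it is the simplest baseline and not because the source prescribes it. The sketch assumes one-step errors that are uncorrelated and roughly normal, in which case the spread of the h-step error grows with the square root of h. The function name `naive_forecast`, the multiplier `z = 1.96` (approximately a 95% interval under those assumptions), and the sample data are all illustrative.

```python
from math import sqrt


def naive_forecast(series, horizon, z=1.96):
    """Naive forecast with an approximate prediction interval.

    Every future value is predicted to equal the last observation.
    The interval half-width is z * sigma * sqrt(h), where sigma is the
    root mean square of the historical one-step changes.
    Returns a list of (point, lower, upper) tuples for h = 1..horizon.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    last = series[-1]
    # One-step errors of the naive method on the observed history.
    diffs = [b - a for a, b in zip(series, series[1:])]
    sigma = sqrt(sum(d * d for d in diffs) / len(diffs))
    forecasts = []
    for h in range(1, horizon + 1):
        half_width = z * sigma * sqrt(h)
        forecasts.append((last, last - half_width, last + half_width))
    return forecasts


# Made-up numeric series; predict the next three values.
history = [10.0, 12.0, 11.0, 13.0, 12.0]
forecasts = naive_forecast(history, horizon=3)
```

Each tuple pairs the point forecast with its lower and upper bounds, and the interval widens as the horizon grows, which reflects the increasing uncertainty further into the future.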
The motivation of this thesis is derived from these issues. The thesis presents a theoretical analysis of time series and then applies it to an environment whose data set consists of multiple time series. The main tasks in this environment include common functions such as data acquisition, data preprocessing, data mining operations for the prediction and