Abstract

Data streams have attracted considerable research attention for years. Mining this type of data is challenging because of its special nature. Decision trees have been among the most widely used techniques for classifying data streams because of their high accuracy and greedy construction. This dissertation reviews classification techniques for adaptive data stream mining, focusing on two challenges, concept drift and dimensionality reduction, and dividing the techniques into incremental and ensemble classifiers. The incremental classifiers Very Fast Decision Trees (VFDT) and Concept-adapting Very Fast Decision Trees (CVFDT) are tested, and Adaptive Random Forests (ARF) is taken as an example of an adaptive ensemble classifier. An experimental analysis then compares VFDT, CVFDT, and ARF in terms of accuracy, processing speed, and tree size. Accuracy did not vary much among the three algorithms, while ARF was much faster and produced the smallest number of tree nodes. Next, we describe VFDT as one of the most widely used decision tree algorithms for data streams, and we present VFDT-S1.0, an extension of VFDT that uses bagging and sampling techniques. Finally, we run a simulation comparing the two algorithms in terms of accuracy and processing time. The experimental results show that the proposed modification reduces classification time by more than 20% on more than one dataset, while the effect on accuracy was less than 1% on some datasets. These timing results indicate that the algorithm is suitable for mining fast data streams.
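As background for readers unfamiliar with VFDT, the split decision it is generally known for can be sketched as follows. VFDT grows a tree incrementally and splits a leaf once the Hoeffding bound indicates that the best attribute is reliably better than the runner-up. This is a minimal illustration only, not the dissertation's implementation: the function names, the parameter names, and the example values (`delta`, `tie_threshold`, the gain figures) are assumptions chosen for the sketch.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2n)).

    value_range (R) is the range of the split metric, delta is the allowed
    probability of error, and n is the number of examples seen at the leaf.
    """
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float, value_range: float,
                 delta: float, n: int, tie_threshold: float = 0.05) -> bool:
    """Decide whether a leaf should be split (hypothetical helper).

    Split when the gap between the two best attributes exceeds epsilon,
    or when epsilon has shrunk below the tie threshold, meaning the two
    candidates are too close for more examples to matter.
    """
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold


if __name__ == "__main__":
    # Illustrative values only: a two-class problem (R = 1 for information
    # gain measured in bits), delta = 1e-7, 200 examples at the leaf.
    print(hoeffding_bound(1.0, 1e-7, 200))
    print(should_split(0.30, 0.05, 1.0, 1e-7, 200))
```

Because the bound shrinks as more examples arrive at a leaf, the decision can be made from a bounded sample of the stream rather than from the full data, which is what makes this family of trees suitable for fast streams.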