Research Interests

Data Mining and Knowledge Discovery
Big Data
Data Science
Geographic Information System (GIS)
Computer Modeling and Simulation

Undergraduate Research Projects

Machine learning in robotics

The goal of this project is to investigate how combining reinforcement learning with radial basis function neural network (RBFNN) learning can improve a robot's task execution. To take advantage of a controlled environment and controlled variables, the robot has been ported to a simulator called VREP. The state is evaluated using ultrasonic sensors positioned on the front, left, and right sides of the robot. The robot reads the distances returned by all sensors, calculates its orientation and distance from the wall, and combines them into a state. It then passes this state to the reinforcement learning algorithm to decide which action to take. The knowledge gained through reinforcement learning is used to train the RBFNN, which refines the action associated with each state to improve the robot's performance. This project was presented at the Texas ACET 2016 annual computing conference.
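
As a rough illustration of the sense-decide-learn loop described above, the following is a minimal sketch of tabular Q-learning over a state built from three ultrasonic readings. It is not the project's implementation: the action set, the bin size, the learning parameters, and the names discretize and QLearner are all assumptions made for this example, and the RBFNN training stage is not shown.

```python
import random
from collections import defaultdict

# Hypothetical action set; the project's real actions are not given in the text.
ACTIONS = ("forward", "turn_left", "turn_right")


def discretize(front, left, right, bin_size=0.25, max_bin=4):
    """Map three ultrasonic distances (assumed to be in metres) to a discrete state."""
    def to_bin(distance):
        return min(int(distance / bin_size), max_bin)
    return (to_bin(front), to_bin(left), to_bin(right))


class QLearner:
    """Tabular Q-learning with an epsilon-greedy policy (illustrative parameters)."""

    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration probability
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.rng = random.Random(seed)

    def choose(self, state):
        """Pick a random action with probability epsilon, else the best known one."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        """Move Q(state, action) toward reward + gamma * max_a Q(next_state, a)."""
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
```

In a setup like this, the learned table of state-action values could later serve as training data for a function approximator such as an RBFNN, which is the role the project description assigns to the second learning stage.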

Programming is a Snap!: Increasing Knowledge and Interest in Computing

This project investigates whether high school students’ interest in and knowledge of computing can be increased by engaging them in an hour-long, hands-on game programming lab led by undergraduates. Undergraduates create the instructional materials, conduct the hands-on activity, and participate in evaluating the effectiveness of the approach. The instructional materials include a completed game, partial versions of the game for students to expand, PowerPoint slides, hands-on exercises, and pre- and post-participation questionnaires. The questionnaires are used to assess whether participation in the learning activity increased the students’ knowledge of and interest in computing. We have conducted five workshops with 230 high school students. Formal assessment results indicate that the workshops have the potential to significantly increase knowledge of programming concepts. This project was presented at the ACM 16th Annual Conference on Information Technology Education (SIGITE 2015).

Design and implementation of clustering algorithms for big data analysis

Clustering is a very useful unsupervised learning technique and one of the fundamental data mining tasks for data analysis. As data sets grow extremely large, it becomes inefficient or even impossible to store and process them on a single machine. The scalability problem of clustering algorithms that run on a single machine therefore has to be addressed. The goal of this project is to improve traditional clustering algorithms by utilizing high-performance computing clusters and powerful programming platforms, such as MapReduce and Spark, for big data analysis. In particular, we design a MapReduce-based clustering algorithm for big data analysis. This project is supported by the Lamar University McNair Scholar Program. This work was presented at the Lamar University 2017 Undergraduate Expo and won second place in the undergraduate poster presentation.
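
To make the idea of a MapReduce-based clustering step concrete, here is a minimal single-process sketch that expresses one k-means iteration as a map phase and a reduce phase. The text does not say which clustering algorithm the project parallelizes, so k-means is used here purely as a familiar stand-in. The function names are hypothetical, and the code runs in plain Python rather than on Hadoop or Spark.

```python
from collections import defaultdict


def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_phase(partition, centroids):
    """Mapper: emit (centroid index, (point, 1)) for each point in one partition."""
    for point in partition:
        yield nearest(point, centroids), (point, 1)


def reduce_phase(pairs):
    """Reducer: sum points and counts per key, then average to get new centroids."""
    sums = {}
    counts = defaultdict(int)
    for key, (point, n) in pairs:
        if key in sums:
            sums[key] = [s + p for s, p in zip(sums[key], point)]
        else:
            sums[key] = list(point)
        counts[key] += n
    return {k: tuple(s / counts[k] for s in sums[k]) for k in sums}


def kmeans_iteration(partitions, centroids):
    """Run the mapper over every partition, then reduce all emitted pairs."""
    pairs = [kv for part in partitions for kv in map_phase(part, centroids)]
    updated = reduce_phase(pairs)
    # Keep the previous centroid when no point was assigned to it.
    return [updated.get(i, centroids[i]) for i in range(len(centroids))]
```

Each partition stands for the slice of data held by one worker. Only the small (key, partial sum) pairs would need to cross the network in a real deployment, which is what lets this style of algorithm scale beyond a single machine.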

Data mining for environmental science

Air quality is very important in the ongoing effort to pursue a cleaner and healthier environment. This research intends to analyze air pollution data and make predictions for the future using data mining techniques. The goal of this project is to design and develop a data mining framework for analyzing air pollutant data in South Texas. The framework is meant to provide information about air quality in this area, extract knowledge about pollution events from the data, and help answer analytical questions from domain experts. It should also help analysts find interesting patterns in the air pollution data and make preliminary predictions. This project is supported by the Lamar University Office of Undergraduate Research.

Graduate Research Projects

Clustering on Hadoop

We designed a parallel density-based clustering algorithm called MR-SNN, built on MapReduce, for big data analysis.

Data Mining for Environmental Science

We developed a data mining framework for air quality data analysis, which includes data preprocessing, clustering, post-processing analysis, and visualization techniques.

Social Network Analysis

We developed a data mining framework for location-based social media analysis on the Hadoop ecosystem.

Change Analysis

We introduced a polygon-based change discovery algorithm called Poly-CD to identify change patterns within spatial temporal clusters.

Spatial Temporal Clustering

We introduced a density-based spatial temporal clustering algorithm called ST-SNN to identify spatial temporal clusters.

Polygon Spatial Clustering

We developed a density-based spatial clustering algorithm called POLY_SNN for polygons (hotspots) to generate spatial clusters.

Computer Modeling and Simulation

We introduced a systematic methodology based on computer modeling and dynamic simulation for minimizing CPI plant flaring under abnormal operating conditions.

Data Mining for Urban Planning

We introduced a spatial clustering approach called CLEVER to discover interesting regions in cities, including regions that serve different functions.