Dissertation Defense Announcements

Candidate Name: Nicholas Horvath
Title: Integration of advanced manufacturing in the mechanical design of reflective optics
 October 05, 2020  9:30 PM
Location: Committee in person/Stream via MS Teams

This dissertation comprises a series of authored papers, delineated by chapter, that incorporate advanced manufacturing techniques and processes throughout the design process for the future development of high-quality reflective optics. The dissertation includes a novel kinematic mount design used for manufacturing and metrology of a freeform optic, an experimental study on additively manufactured silicon carbide for optical applications, and a new design methodology for higher-efficiency lightweight mirrors that treats additive manufacturing as the main process chain. Freeform optics, additive manufacturing, and silicon carbide mirrors are each disruptive technologies in their own right. The work described in this dissertation merges them into a systematic framework that has the potential to revolutionize both the manufacturing process chain and the mechanical design of lightweight mirrors. Together, the three papers lay foundational work in reflective optics for overcoming manufacturing challenges and for advancing mechanical design in consideration of advanced manufacturing. The result is a significant advancement in the state of the art for the creation of additively manufactured, high-efficiency, freeform silicon carbide reflective optics.

Please email nhorvat1@uncc.edu with the subject line "Dissertation Defense" for the Teams information.

Candidate Name: Pengyu Ni
Title: Prediction of cis-regulatory modules in genomes
 October 02, 2020  10:00 AM
Location: https://uncc.webex.com/uncc/j.php?MTID=mda95e921de330b36667545c0f63cff08

Annotating all cis-regulatory modules (CRMs) and their constituent transcription factor (TF) binding sites (TFBSs) in genomes is essential to understanding genome function; however, the task remains highly challenging. Here, we developed a new algorithm, dePCRM2, for predicting CRMs and TFBSs by integrating numerous TF ChIP-seq datasets. We predicted 1,404,973 CRMCs, and dePCRM2 largely outperforms existing methods. Epigenomic marks play complex roles in cell fate determination, but little is known about the sequence determinants that define them. We showed that two types of convolutional neural networks (CNNs), one for cell types and one for histone marks, are good strategies for uncovering the sequence determinants and their importance and interactions. With a pipeline for predicting the map of CRMs and a strategy for pinpointing the importance of motifs to the epigenetic marks in the CRMs, a complete categorization of CRMs and constituent TFBSs in human and model organisms can facilitate characterizing the functions of regulatory sequences in these organisms. To aid the use of these predicted CRMs and TFBSs by the research community, we developed an online database, PCRMS (predicted CRMs). The PCRMS database can be a useful resource for characterizing the functions of regulatory genomes in important organisms.

Candidate Name: Maryam Tavakoli Hosseinabadi
Title: Heterogeneous Feature Integration for Regression in Multimodal Healthcare Applications
 September 24, 2020  6:00 PM
Location: Zoom link: https://uncc.zoom.us/j/93165187032

The increasing performance of feature extraction and regression modeling in various domains raises the hope that machine learning and deep learning can assist clinicians in numerous healthcare applications. However, the complex and multimodal nature of the problems and the scarcity of high-quality labeled data in this domain introduce several challenges and limitations. These challenges, along with a lack of interpretability, undermine the generalizability and usability of many state-of-the-art machine learning models.

This dissertation focuses on using multimodal sources of data for regression modeling in healthcare applications. The argument is that domain knowledge describes the nature of each modality's relationship with the target function. This relationship can characterize the appropriate level of representation and an efficient integration method. We define a framework with two heterogeneous modalities: one modality provides more local features, while the other contains higher-level global information. We demonstrate the framework's applicability to multiple healthcare regression tasks.

In this framework, we propose two approaches for increasing the performance in the absence of large-scale data: leveraging the abstraction of the modality representations based on domain knowledge, and a tree-structure convolutional neural network for integrating the information from the heterogeneous modalities. This framework is discussed in more detail for two different cases of "Alzheimer's disease progression prediction" and "radiation therapy treatment planning." The former predicts a scalar target variable, while the latter approximates a two-dimensional one. The first application's performance is compared with the previous submissions for the same dataset; it outperforms the best-reported results.

Candidate Name: Shelvasha Burkes
Title: Comparative Analysis of Repeat Landscapes in Avena (Oat)
 September 23, 2020  10:00 AM
Location: Virtual

Avena sativa, or common oat, is a staple crop and a member of the Poaceae, or grass, family. Ranking behind wheat, maize and rice, oats accounted for 10.5 million hectares of the world’s produced crops as of 2017. Phytocompounds such as β-glucan and other phytochemicals such as avenanthramides and vanillic, syringic, ferulic, and caffeic acids are noted to benefit cardiovascular health or are of prospective benefit to human health. However, further investigation into these potential benefits requires research that surpasses past work in breadth and scope. Much has been done to bridge the gap in resources for oats, such as the development of high-throughput markers, consensus linkage maps and, most recently, genome sequencing efforts; however, the relative complexity of cultivated oat, an allohexaploid with highly similar subgenomes, poses additional challenges to the development of these resources. A final layer of complexity is the genome size of hexaploid oat, believed to be approximately 12.8 gigabases, of which a significant portion is composed of complex repetitive elements. Characterizing these highly complex regions is difficult because reads containing repetitive regions are hard to map, which complicates assembly efforts and results in misassemblies and gaps. Using a novel pipeline capable of offering enhanced resolution, repetitive elements were examined within well-characterized Avena genomes. The work concludes with phylogenetic analyses of the evolutionary relationships between elements, in an effort to bolster overall knowledge of the genus Avena and the role of transposable elements throughout it.

Candidate Name: Erica Moody
 October 06, 2020  10:00 AM
Location: Online

The student teaching experience is one that is typically filled with a wide range of
triumphs and challenges, and novice student teachers (STs) tend to rely heavily on their
Cooperating Teachers (CTs) to help navigate the experience. CTs have a strong
influence on the development of STs; however, far too often CTs are under-prepared to
carry out the many duties required of them. Without adequate support and training for
CTs, STs may not receive the level of support needed to properly equip them with the
skills needed for the challenging first years of teaching. This study investigated the
training and support provided to CTs, and examined the challenges CTs faced during the
student teaching experience. This study also investigated two levels of training for CTs -
those who participated in the standard training provided by the Educator Preparation
Program (EPP) and those who completed additional training provided by the EPP through
the Teacher Education Institute (TEI), a multi-day summer institute. A total of 361 CTs
participated in this quantitative study and completed a survey about the training and
support received from the EPP, as well as challenges they encountered while supervising
STs. Results showed very few differences between TEI and non-TEI trained CTs; both
groups had mostly positive experiences and were mostly satisfied with the training and
support provided. CTs in both groups reported similar challenges related to preparation
areas such as edTPA, having difficult conversations with STs, and providing
feedback/coaching to them, suggesting that these areas may require additional support
and training prior to and during the student teaching experience.

Candidate Name: Riyi Qiu
Title: Modeling uncertainty in deep learning models of Electronic Health Records
 September 30, 2020  9:30 AM
Location: Zoom link: https://uncc.zoom.us/j/98399126024

Recent research development has demonstrated the advantages of deep learning models in prediction tasks on
electronic health records (EHR) in the medical domain. However, the prediction results tend to be difficult to
explain due to the complex neuron structures. Without the explainability and transparency, deep learning
models are not trustworthy or reliable for making real world decisions, especially the high-stakes ones in the
healthcare domain. To improve the trustworthiness of the deep learning model, quantifying the uncertainty is

In this dissertation work, we proposed several Bayesian Neural Network (BNN) structures to estimate the data
uncertainty and model uncertainty associated with the EHR data and deep learning models, respectively. We
also proposed Variational Neural Network (VNN) algorithms to estimate the uncertainty of the variables to
investigate the medical and temporal features that contribute the most to the patient-level uncertainty. In order
to verify the validity of the uncertainty estimations, we designed a series of experiments to examine the
computational results against widely accepted facts about uncertainty. We also conducted post-hoc analysis to
evaluate whether the proposed models tend to specialize in one or more patient subgroups, at the cost of
model performance on others, as well as whether the treatment (improving uncertainty in one subgroup) will
mitigate such performance cost. The experimental results have confirmed the validity of our computational
approaches. Finally, we conducted a user study to understand the clinicians' perception of the proposed
uncertainty models.

Candidate Name: Jing Chen
Title: Assessments of indel annotation programs and comparative somatic indel analysis in cancer genomes
 September 25, 2020  10:00 AM
Location: Online defense

Insertions and deletions (indels) represent the second largest variation type in human genomes and have been implicated in the development of cancer. Accurate indel annotation is of paramount importance in variant analysis in both healthy and disease genomes. Previous studies have shown that existing indel calling methods generally produce high rates of false positives and false negatives, which limits downstream investigation of the structural and functional effects of indels.
To assess the accuracy of indel calling programs, we carried out a comparative analysis by evaluating 7 general indel calling programs and 4 somatic indel calling programs, using 78 healthy samples from the 1000 Genomes Project and 30 cancer samples from The Cancer Genome Atlas (TCGA). We adopted a comprehensive and more stringent indel comparison approach, and an efficient way to use a benchmark for improved performance comparisons of the general indel calling programs. We found that deriving germline indels in healthy genomes by combining several indel calling tools could help remove a large number of false positive indels from individual programs without compromising the number of true positives. The performance comparisons of somatic indel calling programs are more complicated due to the lack of a reliable and comprehensive benchmark.
We further performed a comparative analysis of somatic indels in two cancer types, BRCA and LUAD. We compared somatic indels in both coding regions and non-coding regulatory regions such as transcription factor binding sites (TFBSs). We used an improved algorithm to predict TFBSs in human genomes and analyzed their evolutionary and structural roles. Our comparative results indicated that while there are differences between LUAD and BRCA genomes, both show a higher deletion rate, coding indel rate and frame-shift indel rate. Somatic indels tend to be located in sequences with important functions, in both coding and non-coding regions. This study can serve as the first step in future pan-cancer analysis for identifying key variant markers of cancer genomes.

Candidate Name: Zijing Lin
 October 01, 2020  8:00 AM
Location: Online through webex

This research focuses on evaluating the potential use of crowdsourced bike data and comparing them with the traditional bike count data collected in the City of Charlotte, NC. Bicycle volume models are developed using bike data from both the Strava smartphone cycling application and the bicycle count stations. Based on the results, a bicycle volume predictive model is presented, and a map illustrating the bicycle volume on most of the road segments in the City of Charlotte is generated. In addition, to gain a better understanding of the attributes that have an impact on cycling, other supporting data are collected and combined with the Strava bicycle count data. Multiple discrete choice models are developed to analyze Strava users’ cycling activities. Furthermore, a bicyclist injury risk analysis is conducted to explore the factors affecting biking safety by developing a series of safety performance functions. Several indicators for model comparison are used to select the best-fitting model for bicyclist injury risk modeling. Finally, recommendations are made to help improve the cycling environment and safety and to increase bicycle volume in the future.

Candidate Name: Pengfei Liu
Title: Impact of Connected and Autonomous Vehicles on Mobility of Highway Systems
 October 06, 2020  8:00 AM
Location: Remote through Webex

Connected and autonomous vehicle (CAV) technologies are known as an effective way to improve the safety and mobility of the transportation system. Combining connected vehicle and autonomous vehicle technologies, CAVs share real-time traffic data with each other, such as position, speed, and acceleration. CAVs need a smaller lane width and headway, which will lead to a higher roadway capacity. CAVs may perform coordinated weaving maneuvers, which will increase weaving section capacities. CAVs also enable communication between vehicles and traffic signals. The coordinated operation among CAVs and the communication between CAVs and traffic signals will improve throughput at signalized intersections and lead to a higher intersection capacity. To quantify the impact of CAVs on freeway capacity and intersection mobility, new guidelines should be established that are suitable for conducting various types of analyses involving CAV strategies. The impact of different CAV penetration rates in the highway system on various facilities under different scenarios should be examined. The results of this research could lead to a better understanding of how CAVs will improve mobility on highway systems.

Candidate Name: Yang Li
 September 29, 2020  8:00 AM
Location: Remote Defense via WebEx - Link: https://uncc.webex.com/uncc/j.php?MTID=m2455a98565297b0248ae1eb3cfc0847b

As one of the most vulnerable entities within the transportation system, pedestrians may face more danger and sustain more severe injuries in traffic crashes than other road users. The safety of pedestrians is particularly critical within the context of continuous traffic safety improvements in the US. Moreover, traffic crash data are inherently heterogeneous, and such heterogeneity can lead to incorrect conclusions in many ways. Therefore, proper modeling approaches need to be developed and applied to identify the causes of pedestrian-vehicle crashes and better ensure the safety of pedestrians.

In addition, with the development of artificial intelligence techniques, a variety of novel machine learning methods have been established. Compared to conventional discrete choice models (DCMs), machine learning models are more flexible, require few or no prior assumptions about input variables, and adapt better to outliers and to missing and noisy data. Furthermore, crash data have inherent patterns related to both space and time. Crashes that occurred in locations with highly aggregated uptrend patterns are worth exploring to examine the most recently deteriorative factors affecting pedestrian injury severities in crashes.

The major goal of this study is to develop a framework for modeling and analyzing pedestrian injury severities in single-pedestrian-single-vehicle crashes that provides a higher resolution in identifying contributing factors and their associated effects on the injury severities of pedestrians, particularly the most recently deteriorative factors. Both conventional DCMs and the selected machine learning model, i.e., the XGBoost model, are developed. Detailed comparisons among all developed models show that the XGBoost model outperforms all the conventional DCMs on all selected measures. In addition, an emerging hotspot analysis is used to identify the most targeted hotspots, followed by a proposed XGBoost model that analyzes the most recently deteriorative factors affecting pedestrian injury severities. Completing all of the above tasks could help bridge the gap between theory and practice. A summary and conclusions of the whole research are provided, and further research directions are given at the end.