Paper Title: Immutable Infrastructure with Actionable Monitoring on Containers (Kubernetes)

Author(s) and Affiliation: Mizan Hemani, Minnesota State University, Mankato, mizan.hemani@mnsu.edu

Abstract: With the dawn of cloud computing and the growing popularity of containers that run applications and microservices, it has become easier to build new architectures that are deployable as smaller, cohesive segments that are highly scalable. Container-level deployment makes it easier to manage deployments between different environments; however, it carries forward the existing habit of interacting directly with the server while bypassing the pre-configured deployment pipeline, potentially creating configuration drift and exposing the system to security vulnerabilities. In this paper, we explore the lack of immutability in a container infrastructure by monitoring audit-level logs of interactions with Kubernetes and performing actions based on established policies. By leveraging such policies, this paper proposes a pattern that can ensure an intact infrastructure and reinforce good security and system maintenance principles.

Keywords: Immutable Infrastructure, Containers, Kubernetes, Microservice, Architecture, Actionable Monitoring, Security in Containers

Paper Title: Delay Tolerant Network Security 

Author(s) and Affiliation: Rishabh Yata, Minnesota State University, Mankato, Rishabh.yata@mnsu.edu

Abstract: A delay-tolerant network (DTN) is a store-and-forward network where end-to-end communication is not assumed and where data transmission is performed using opportunistic connections between nodes. A DTN is a sparse wireless network that has recently been used alongside existing networks to link devices in challenging environments or in the underdeveloped world. In any protected environment, such as the military, a network security protocol is often needed. In a DTN, a complete path from source to destination does not exist most of the time, which contributes to the difficulty of routing packets in such an environment. For large-scale deployment of delay-tolerant networks, protection and privacy are essential. People are hesitant to consider such a new network model without protection and privacy assurances. Therefore, in this paper, I plan to discuss various security and cryptography concepts and protocols that are currently in use and propose some promising enhancements to DTN security.

Keywords: DTN, Delay Tolerant Networks, Security, Routing, IoT, Networking 

Paper Title: Using Prototyping to Teach Design Thinking  

Author(s) and Affiliation: Mary Lebens, Metropolitan State University, mary.lebens@metrostate.edu 

Abstract: Companies using design thinking increase revenues and shareholder returns at almost double the rate of their industry peers, yet more than 90% of companies do not employ design thinking, in part due to a lack of design skills in the workforce. Adding design thinking to the curriculum is imperative to address this skills gap. Most research emphasizes developers and users physically working together, so it is significant to learn whether online students who are never physically present together in the classroom can successfully learn design thinking skills. This study examines whether students in an “asynchronous online” undergraduate systems analysis course can successfully apply user-centered design standards to develop a system prototype. Additionally, the study examines whether students are able to provide substantive feedback to their peers on their prototypes while participating in an iterative review process. The study method employed a model for prototype design, review, and assessment. The study demonstrates that over two course sections, the majority of students in an asynchronous online course successfully developed web prototypes that employed user-centered design and effectively provided feedback to peers on their prototypes during an iterative review process. The implication is that faculty can feel confident in employing design thinking and prototyping in asynchronous online courses to teach these valuable skills.

Keywords: Design Thinking, Prototyping, Iterative Review, Asynchronous Online Courses 

Paper Title: Evaluation of P2P Loan Default Detection Models

Author(s) and Affiliation: Queen E. Booker, Metropolitan State University, Queen.booker@metrostate.edu, Mousumi Munmun, Metropolitan State University, Mousumi.munmun@metrostate.edu

Abstract: The Peer-to-Peer (P2P) lending model is growing rapidly in the US economy. A robust charge-off/default detection method is needed to improve the quality of the P2P lending market and establish a more sustainable industry. This study compares the Zhang (2020) Logistic Regression (LR) model to a Deep Learning Neural Network (DLNN) and Naïve Bayes (NB). Based on the Lending Club dataset and Zhang’s (2020) variables, however, no model was particularly effective at detecting potentially bad loans.

Keywords: Peer-to-Peer Lending, Logistic Regression Model, Deep Learning Neural Network Model, Naïve Bayes Model

Paper Title: What do Twitter sentiments say about the COVID-19 Vaccine?

Author(s) and Affiliation: Ilma Sheriff, Computer Information Science, Minnesota State University, Mankato, MN, ilma.sheriff@mnsu.edu, Naseef Mansoor, Computer Information Science, Minnesota State University, Mankato, MN, naseef.mansoor@mnsu.edu 

Abstract: The coronavirus disease (COVID-19) pandemic led to substantial public discussion. Understanding these discussions can help institutions and individuals navigate through this pandemic. In this paper, we analyze and investigate Twitter sentiments toward the COVID-19 vaccine. Starting from a publicly available Twitter dataset on the COVID-19 vaccine from Kaggle, we create a unified dataset containing data about public sentiments, sentiment scores, and COVID-19 cases for various U.S. states. To generate sentiment scores from the tweets, we applied the Valence Aware Dictionary and sEntiment Reasoner (VADER) sentiment analyzer. These scores were then classified into positive, negative, and neutral sentiment classes using a simple threshold-based classifier. From our analysis, we observe that in our dataset around 41.93% of the tweets are positive, 17.64% are negative, and 40.42% are neutral. We also analyzed the data based on the geographic locations of the tweets to answer the following questions: 1) Is there any relationship between the number of tweets and the number of COVID-19 cases? 2) Is there any shift in public sentiment after the approval of the vaccine? Our analysis shows a high correlation between the number of tweets and the number of COVID-19 cases, as well as a decrease in negative sentiment after the approval of the vaccine.

Keywords: COVID-19 Vaccine, Sentiment Analysis, VADER, Twitter Mining, NLP
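
The threshold-based classification step described in the abstract above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes each tweet already has a VADER compound score in [-1, 1], and the cutoff of 0.05 is the value commonly suggested for VADER rather than one stated in the abstract. The function names classify_sentiment and class_shares are hypothetical.

```python
def classify_sentiment(compound: float, threshold: float = 0.05) -> str:
    """Map a compound score in [-1, 1] to a sentiment class.

    The default threshold is an assumption (the cutoff commonly used
    with VADER), not a value taken from the paper.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def class_shares(scores):
    """Return the percentage of scores that fall in each sentiment class."""
    counts = {"positive": 0, "negative": 0, "neutral": 0}
    for score in scores:
        counts[classify_sentiment(score)] += 1
    total = len(scores)
    if total == 0:
        # No tweets: report zero for every class instead of dividing by zero.
        return {label: 0.0 for label in counts}
    return {label: 100.0 * n / total for label, n in counts.items()}
```

Applying class_shares to the compound scores of a whole dataset would give class percentages of the kind reported in the abstract.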

Paper Title: Automated stock recommendations using Financial Indicators and Machine Learning 

Author(s) and Affiliation: Utkarsh Sharma, B. Tech CSE ASET, Amity University, Noida, skhiearth@gmail.com, Simran Gogia, B. Tech CSE ASET, Amity University, Noida, simrangogia20@gmail.com

Abstract: The stock market is regarded as one of the high-yielding long-term investments, yet a majority of people do not capitalize on it. Dubious advice and attempts to ‘beat the market’ usually give rise to skepticism and distrust among first-time investors. This paper proposes a subjective, low-risk stock market advising platform that applies Machine Learning clustering (K-Means) to basic Financial Indicators used to track the performance of stocks in the exchange, serving as an aid in investment decisions, particularly for first-time investors. The results suggest that clustering-powered subjective recommendations can prove to be a low-risk advising tool.

Keywords: Stocks, Recommendations, Finance, Analysis, Machine Learning, Clustering Algorithms

Paper Title: Strategies that Guide the Availability, Information Security, and Scalability of Future Wireless Sensor Networks (WSNs)  

Author(s) and Affiliation: Sapumal Darshana Salpadoru Thuppahi, Minnesota State University, Mankato, sapumal.salpadoruthuppahi@mnsu.edu, Michael Hart, Minnesota State University, Mankato, michael.hart-2@mnsu.edu

Abstract: Wireless Sensor Networks (WSNs) give industries the opportunity to manage vast numbers of sensors over various types of computer networks. New WSN research indicates several advantages for industries not currently using the associated technological advancements. To help these industries, the authors outline guidance that helps inform future WSN implementation frameworks. Using this guidance, the authors propose an iteration of a new WSN model for agriculture. The prototype addresses several needs, including high availability, information security, and scalability of wireless sensor networks using commodity hardware often present in this industry.

Keywords: Wireless Sensor Networks, Wireless Security, Wireless Scalability 

Paper Title: Twitter Data Analysis about COVID-19 Vaccines using Sentiment Analysis

Author(s) and Affiliation: Maharu Chamara Wickramarathne, Minnesota State University, Mankato, maharu.wickramarathne@mnsu.edu

Abstract: The world took tremendous measures to find a cure for COVID-19. After multiple attempts at vaccines against the virus, two vaccines were approved by the Food and Drug Administration (FDA) and the World Health Organization for distribution in the USA: the Pfizer/BioNTech COVID-19 vaccine and the Moderna COVID-19 vaccine. However, people have many questions about the vaccines (for example, “What are the side effects?”). Answering these questions and addressing these doubts is necessary for successful vaccination of the people. This research attempts to answer these questions using Twitter data. For each vaccine, two thousand tweets from the state of Minnesota, selected by a hashtag of the vaccine name, were mined and analyzed. These tweets revealed most people’s opinions about the vaccines and how well they performed. Twitter data mining and cleaning procedures in R were used to gain better insight. The word cloud data visualization technique and sentiment analysis methods helped to explore these questions among the people in Minnesota.

Keywords: 

Paper Title: The Impact of AES Encryption on SCADA Systems for Electrical Distribution that Contain HDFS Architecture

Author(s) and Affiliation: Justin Wren, justinwren01@gmail.com, Michael Hart, Minnesota State University, Mankato, michael.hart-2@mnsu.edu

Abstract: Supervisory Control and Data Acquisition (SCADA) systems for electrical utility companies have an increasing need to provide additional insight into smart grid data. A significant contingency is the ability to design information security and big data architecture into IT infrastructure that demands minimal network latency. This study explores an IT infrastructure design for electrical generating stations that have the capability to stream encrypted internal SCADA data to a Hadoop Distributed File System (HDFS). Using the design science research methodology, the authors designed and implemented an IT critical infrastructure that uses the Advanced Encryption Standard (AES) between primary SCADA systems and intelligent electronic devices (IEDs). Results illustrate a marginal difference in network packet latency between security gateways that load balance individual relays to IEDs and single instance security gateways that handle all relays to IEDs using a LAN substation. Despite the introduction of network latency, the proposed critical IT infrastructure design decreases the amount of unencrypted data in SCADA environments and could allow streaming data securely to HDFS. Findings emphasize that carefully designing security gateways and encryption in SCADA systems is a viable and necessary step when considering streaming data from IEDs to big data environments.

Keywords: Advanced Encryption Standard, Hadoop Distributed File System, Smart Grid, Supervisory Control and Data Acquisition 

Paper Title: Blockchain in COVID-19 Vaccine Distribution 

Author(s) and Affiliation: Tiati Thelen, Minnesota State University, Mankato, Tiati.Thelen@mnsu.edu, Rajeev Bukralia, Minnesota State University, Mankato, Rajeev.Bukralia@mnsu.edu 

Abstract: Supply chain management has started utilizing blockchain technology to access information from the start of production to the consumer. Blockchains create records of consistent information. Recently, blockchain technology has been introduced into the pharmaceutical supply chain to track the temperatures of vaccines from production to patient. Additionally, IoT (Internet of Things) assists blockchains by utilizing embedded sensors and software to supply blockchains with the pertinent information. This is vital because vaccines are temperature sensitive. This research provides the foundations to consider these technologies in the domain of the COVID-19 vaccine, which is unique in that many are produced in two doses. This paper contributes a systematic review of previous works and how they can effectively be advanced to the COVID-19 vaccine supply

Keywords: Pharmaceutical Supply Chain, COVID-19 vaccine, Blockchain, IoT

Paper Title: Detecting Online Review Fraud Using Sentiment Analysis 

Author(s) and Affiliation: Bryn Caron, Minnesota State University, Mankato, bryn.caron@mnsu.edu, Rajeev Bukralia, Minnesota State University, Mankato, rajeev.bukralia@mnsu.edu

Abstract: With the exponential increase in e-commerce, online reviews have become integral to the marketing of products and services. Customers are inclined to buy products and services that have received high ratings and positive reviews. Consequently, fake reviews are increasingly becoming a way to mislead customers into trusting, or mistrusting, the credibility and reliability of a product or service. Though online fake reviews have garnered some attention from the media and research communities, there is a need for effective technical solutions for detecting, and therefore mitigating, fraudulent reviews to improve consumer confidence in e-commerce. The purpose of this study is to explore the use of natural language processing techniques in detecting fake online reviews. We analyze the text of online reviews for various book titles. We investigate the accuracy of the polarity score, a common metric used in sentiment analysis, in the context of the star rating of the reviews. Our findings indicate that the polarity score is not a reliable measure for detecting fake reviews. In addition, the study sheds light on the limitations of sentiment analysis in detecting fake reviews.

Keywords: Fake Reviews, Sentiment Analysis, Natural Language Processing (NLP), E-commerce, Text Analytics, Text Mining

Paper Title: Ensemble Learning for Authorship Verification

Author(s) and Affiliation: Abdul Wahab Mohammad, Minnesota State University, Mankato, abdulwahab.mohammad@mnsu.edu, Dr. Michael Hart, Minnesota State University, Mankato, michael.hart-2@mnsu.edu 

Abstract: Authorship verification is the task in which the author of a given text is identified. In this paper, the author proposes two novel methods to identify the authors of texts on two different benchmark datasets, namely the C50 dataset and the Gutenberg dataset. The author used BERT, a state-of-the-art NLP model, with Siamese networks, and tf-idf with attention models. The BERT model showed very good results on the training data, but it did not generalize well to the testing data. However, the model with tf-idf and an attention mechanism managed to achieve results comparable to the state of the art on the C50 dataset. This paper also discusses how a word2vec-based preprocessing approach works in identifying authors via Siamese networks.

Keywords: Deep Learning, BERT, Authorship Verification, Siamese Networks, Attention Models 

Paper Title: Chatbot Knowledge Retrieval Supported By Forums

Author(s) and Affiliation: Michael A. Nyakonu, Metropolitan State University, El8559ys@go.minnstate.edu

Abstract: In this paper, we look at how a chatbot system with a dynamically growing pool of knowledge can be developed. We look at how a forum’s <thread title and answer> structure can be used as a source of infinite knowledge. The answers will be derived through web crawling. In turn, we hope to demonstrate a new model that provides an infinite knowledge base to chatbot developers.

Keywords: Chatbots, Web Crawlers, Forums

Paper Title: Game Prediction Model(s) for the National Basketball Association

Author(s) and Affiliation: Qin Sun, qin.sun@mnsu.edu, Logan Cook, Logan.cook@mnsu.edu

Abstract: According to Forbes statistics, there are 750 million families watching National Basketball Association (NBA) games in 212 countries. The NBA has become the most globalized and influential professional sports organization in the world. Because the league has an annual revenue of more than 4 billion U.S. dollars, predicting the outcome of NBA games is an interesting problem with great commercial value. In this article, we selected the team and player data for all seasons of the NBA from 2004 to 2020 and, using the R language, evaluated each model on thirty different data splits, yielding thirty accuracy values per model. Our conclusion shows that the K-Nearest Neighbor classifier has the lowest prediction accuracy among the four models, while the SVM classifier is the most accurate.

Keywords: Machine Learning; Algorithm; Support Vector Machine; Sports Analytics; Basketball; NBA; K-Nearest Neighbor; Decision Tree; Random Forest 
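
The evaluation over thirty data splits mentioned in the abstract above can be sketched as a repeated hold-out procedure. The sketch is written in Python rather than the R used by the authors, and it is not their code: the 1-nearest-neighbor stand-in model, the 30% test fraction, the toy data, and all function names are assumptions made for illustration.

```python
import random


def nearest_neighbor_predict(train, point):
    """Predict the label of the training row closest to `point` (1-NN)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    _, label = min(train, key=lambda row: sq_dist(row[0], point))
    return label


def repeated_holdout(rows, n_splits=30, test_fraction=0.3, seed=0):
    """Return one accuracy value per random train/test split.

    `rows` is a list of (features, label) pairs. The split count matches
    the abstract; the test fraction is an assumed value.
    """
    rng = random.Random(seed)
    n_test = max(1, int(len(rows) * test_fraction))
    accuracies = []
    for _ in range(n_splits):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        correct = sum(nearest_neighbor_predict(train, x) == y for x, y in test)
        accuracies.append(correct / len(test))
    return accuracies


# Toy, well-separated data standing in for real team and player statistics.
rows = [((float(i), 0.0), "home_win") for i in range(10)]
rows += [((float(i) + 100.0, 0.0), "away_win") for i in range(10)]
accuracies = repeated_holdout(rows)
print(len(accuracies), sum(accuracies) / len(accuracies))
```

Running the same procedure for each candidate model and comparing the resulting accuracy distributions is one way to rank models as the abstract describes.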

Paper Title: Roadmap Comparison: Telehealth and NIST

Author(s) and Affiliation: Pamal Wanigasinghe, Minnesota State University, Mankato, pamal.wanigasinghe@mnsu.edu, Sarah Klammer Kruse, PhD, RD, Minnesota State University, Mankato, sarah.kruse@mnsu.edu

Abstract: Telehealth has great potential to increase patient access to health services, decrease costs, and improve individual and public wellbeing. In order to fully realize these advantages, patients need to be assured that their health-related data will be protected, and providers must take responsibility for the security and integrity of the data gathered. Adoption and use of telehealth could be reduced or delayed if security risks are not adequately addressed. As the popularity of telehealth increases, it is important to emphasize information security for this emerging healthcare technology. A successful telehealth security plan should include all aspects of security including the underlying frameworks, policies, and education of providers and patients. This paper explores security and privacy risks of telehealth and compares the telehealth roadmaps from two organizations to the recommendations given in the Roadmap for Advancing the NIST Privacy Framework.

Keywords: Telehealth, Cyber Security, Security Roadmap

Paper Title: An OS Benchmark Design to Compare SQL Load on Distributed Big Data Systems

Author(s) and Affiliation: Michael Hart, Minnesota State University, Mankato, michael.hart-2@mnsu.edu 

Abstract: Although vendors publish key benchmarks of big data systems, results can differ under typical industry load and fluctuating network environments. This work develops a SQL load benchmarking process by employing the Design Science methodology. The proposed experimental process measures varied operating systems under normal business load for a popular distributed big data system. Using a modified version of the IBM-supported TPC-DS workflow, the author tests SQL completion times on three separate Apache Spark distributed clusters running Ubuntu Server, Clear Linux, and CentOS Server. Results indicate that load in real-life big data environments has a significant effect on SQL completion times.

Keywords: Big data systems, Big SQL, benchmarking, distributed systems, parallel computing

Research In Progress titles:

  • Parameter Estimation of Non-negative Matrix Factorization in NLP:  A Stochastic Approach
  • Predicting Postpartum Depression using Machine Learning 
  • Using Extreme Value Statistics to Assess Wildfire Risk in Colorado