I did my bachelor's in Punjab, India, and then worked for three years as a decision consultant at MuSigma. After that, I decided to study Data and Knowledge Engineering at the Otto-von-Guericke University (OVGU) Magdeburg, where I finished my degree in 2022. As a student research assistant (HiWi), I did a lot of research on Natural Language Processing and Time Series Prediction. In addition, I worked as a Machine Learning, Data Mining and Deep Learning tutor for several terms. Currently, I am pursuing my PhD at the DZHW in cooperation with the OVGU.
Saijal Shahania
Department of Infrastructure and Methods
Research Assistant
- 0511 450670-0
- 0511 450670-960
List of projects
List of publications
Shahania, S., Spiliopoulou, M., & Broneske, D. (2023). WISHFUL - Website extraction of Institutional Sources with Heterogeneous Factors and User-Driven Linkage. In Delir Haghighi, P. et al. (Eds.), Information Integration and Web Intelligence (iiWAS 2023) (pp. 20-26). Cham: Springer. https://doi.org/10.1007/978-3-031-48316-5_3
Abstract: Extracting information from diverse websites is increasingly important, especially for analyzing vast data sets to detect trends and gain insights. By studying job ads, researchers can monitor employer demand shifts, assisting policymakers in aiding affected workers and industries. However, extraction faces challenges like varied website formats, dynamic content, and duplicate data. This study introduces a method for extracting data from diverse private university websites involving keyword identification, website categorization, and extraction pipelines.
Shahania, S., Purificato, E., Thiel, M., & William De Luca, E. (2023). FACADE: Fake articles classification and decision explanation. In J. Kamps et al. (Eds.), Advances in Information Retrieval (pp. 294-299). Cham: Springer. https://doi.org/10.1007/978-3-031-28241-6_29
Shahania, S., Purificato, E., & William De Luca, E. (2022). Tell me why it’s fake: Developing an explainable user interface for a fake news detection system. In CEUR Workshop Proceedings (Ed.), Proceedings of the 3rd Italian Workshop on Explainable Artificial Intelligence (XAI.it 2022). Udine, Italy: CEUR.
Abstract: In this paper, we present the design and development of an explainable user interface for a fake news detection system. The problem of distinguishing real from fake articles gained a lot of popularity in the last few years, mainly due to the soaring diffusion of social networks and internet bots as means for propaganda and disinformation sharing. By leveraging various explainability methods, i.e. feature importance, partial dependence plots and SHAP values, we aim to show how the combination of different techniques embedded in an interactive user interface can lead to enhanced trust in a detection system for a non-expert user, such as a fact-checker or a content manager. Through several examples, we describe all the explainability component
Shahania, S., Unnikrishnan, V., Pryss, R., Kraft, R., Schobel, J., ... & Spiliopoulou, M. (2022). Predicting ecological momentary assessments in an app for Tinnitus by learning from each user's stream with a contextual multi-armed bandit. Frontiers in Neuroscience Sec. Auditory Cognitive Neuroscience, 2022(16), 1-17. https://doi.org/10.3389/fnins.2022.836834
Wehnert, S., Sudhi, V., Dureja, S., Kutty, L., Shahania, S., & W. De Luca, E. (2021). Legal norm retrieval with variations of the BERT model combined with TF-IDF vectorization. In Association for Computing Machinery (Ed.), ICAIL '21: Proceedings of the Eighteenth International Conference on Artificial Intelligence and Law, São Paulo, Brazil (pp. 285-294). New York, NY, United States: Association for Computing Machinery. https://doi.org/10.1145/3462757.3466104
Abstract: In this work, we examine variations of the BERT model on the statute law retrieval task of the COLIEE competition. This includes approaches to leverage BERT's contextual word embeddings, fine-tuning the model, combining it with TF-IDF vectorization, adding external knowledge to the statutes and data augmentation. Our ensemble of Sentence-BERT with two different TF-IDF representations and document enrichment exhibits the best performance on this task regarding the F2 score. This is followed by a fine-tuned LEGAL-BERT with TF-IDF and data augmentation and our third approach with the BERTScore. We show that there are significant differences between the chosen BERT approaches and discuss several design decisions in the context of statute law.
Shahania, S., Unnikrishnan, V., Pryss, R., Kraft, R., Schobel, J., ... & Spiliopoulou, M. (2021). User-centric vs whole-stream learning for EMA prediction. In IEEE (Ed.), 2021 IEEE 34th International Symposium on Computer-Based Medical Systems (CBMS). Aveiro, Portugal: IEEE. https://doi.org/10.1109/CBMS52027.2021.00033
List of presentations & conferences
Employment
since 09/2022
Research Assistant, DZHW
- Similarity of Regulations and Content Extraction, Recognition and Explanation for Researchers
- Trend Recognition in the WissZeitVG-Discussion 'IchBinHanna' zooming in on aspects of recent debates
- Web crawling of job postings for automatic evaluation
04/2021 - 12/2021
Student Assistant, Otto-von-Guericke-Universität, Magdeburg, Germany
Project Qualiman
- Long-term analysis of students on solid data in order to measure their progress, academic success and their study duration, comparing degrees to each other
- Usage of R and SQL to statistically evaluate the collected data, automate the recording and combination of different sources and simplify derivable actions in a frontend application
04/2020 - 11/2021
Software Developer, LegalHorizon AG, Magdeburg, Germany
- Development of a tool for retrieval of law documents given legislative preparatory documents (e.g. proposals) from the EUR-Lex portal
- EUR-Lex is a website providing details about all public documents of the European Union as well as existing laws and pending proposals
10/2019 - 07/2022
Teaching Assistant, Otto-von-Guericke-Universität, Magdeburg, Germany
Conducting exercise classes for Master and Bachelor study programs: Data Mining, Machine Learning, Knowledge Engineering and Digital Humanities, Deep Learning
10/2019 - 03/2020
Student Assistant, Otto-von-Guericke-Universität, Magdeburg, Germany
Data pre-processing, data extraction from the EUR-Lex website, text summarising, topic modelling and clustering the proposals based on law semantics for use in ad hoc requests.
09/2019 - 12/2019
Software Developer, in4s GmbH, Magdeburg, Germany
Camera calibration for an ongoing project with Volkswagen, determining the intrinsic, extrinsic, and distortion parameters of the cameras.
10/2019 - 03/2020
Senior Decision Scientist, Musigma Business Solutions, Bangalore, India
- Development of an analytical data warehouse together with stakeholders for increasing operational efficiency using SQL and Netezza as well as JIRA and Bitbucket
- Optimization of legal spend, its efficiency and improvement of the audit process together with an Australian insurance company using SQL, Netezza and Cognos
- Proactive identification of fraudulent claims in the initial phases of the claim life-cycle with an estimated savings of around 4M for an Australian insurance company using Machine Learning (esp. Text Mining) methods in R and Python
Education
09/2022 - ongoing
PhD, DZHW, Hannover, Germany
10/2018 - 07/2022
Master of Science, Otto-von-Guericke-Universität, Magdeburg, Germany
- Otto-von-Guericke Scholarship 2020 for exceptional academic achievement and social engagement
- Focus on the fields of Machine Learning (especially Natural Language Processing, Text Mining) and Data Mining
- Master's thesis in the detection of fake news and explainability of the results
08/2011 - 05/2015
Bachelor in Information Technology, UIET, Panjab University, Chandigarh, Punjab, India