CS Katha Barta | ସଂଗଣକ ବିଜ୍ଞାନ କଥା ବାର୍ତା

Hosted by Subhankar Mishra's Lab
People -> Rucha Bhalchandra Joshi, Subhankar Mishra

CS Katha Barta 2024

Upcoming Talks

Past Talks

  1. Dr. Vinod Kurmi, Assistant Professor, IISER Bhopal
    • Date: April 19, 2024, 10:30 hours
    • Title: Deep Fair Learning
    • Abstract

      Deep neural networks trained on biased data often inadvertently learn unintended inference rules, particularly when labels are strongly correlated with biased features. While several approaches have been proposed, one view towards mitigating bias is through adversarial learning. The main drawback of the adversarial method is that it directly introduces a tradeoff with accuracy, as the features that the discriminator deems to be sensitive to discrimination or bias could be correlated with classification. In our work, we show that a biased discriminator can actually be used to improve this bias-accuracy tradeoff. Specifically, this is achieved through a feature-masking approach based on the discriminator's gradients. We ensure that the features favored for the bias discrimination are de-emphasized and the unbiased features are enhanced during classification. One issue with these methods is that they address bias indirectly in the feature or sample space, with no control over learned weights, making it difficult to control the bias propagation across different layers. Based on this observation, we introduce a novel approach to address bias directly in the model's parameter space, preventing its propagation across layers. Our method involves training two models: a bias model for biased features and a debias model for unbiased details, guided by the bias model. We enforce dissimilarity in the debias model's later layers and similarity in its initial layers with the bias model, ensuring it learns unbiased low-level features without adopting biased high-level abstractions. By incorporating this explicit constraint during training, our approach shows enhanced classification accuracy and debiasing effectiveness across various synthetic and real-world datasets of different sizes.

  2. Dr. Raghava Mutharaju, Assistant Professor, CSE, IIIT Delhi
    • Date: April 19, 2024, 14:30 hours
    • Title: Applications of Symbolic and Neuro-Symbolic AI
    • Abstract

      Heterogeneous data from different sources are often related to each other. In order to derive value, the data should be integrated, structured, and the relationships should be made explicit. Knowledge Graphs (KG) can play a key role in achieving these goals. In the first part of the talk, after briefly introducing Knowledge Graphs and Ontologies, I will discuss two use cases that make use of KGs. In the second part of the talk, I will discuss the advantages of combining the neural and the symbolic aspects of AI through two use cases.

  3. Dr. Thiparat Chotibut (Thip), Chula Intelligent and Complex Systems Lab, Chulalongkorn University (YouTube, Slides)
    • Date: Apr 18, 2024, 09:30 hours
    • Title: From explainable NLP to quantum dynamics prediction: A two-way synergy between many-body quantum physics and temporal machine learning models
    • Abstract

      In this talk, we will discuss our recent work that highlights the fruitful interplay between many-body quantum physics and temporal machine learning models. The first part, "Quantum Meets Language," employs techniques from many-body quantum physics to enhance explainability in the common natural language processing task of sentiment analysis. We will examine how transforming a recurrent neural network model into its matrix product states counterpart can inform design principles and facilitate interpretable predictions in machine learning models for sentiment analysis [1]. The second part, "Forecasting Many-Body Quantum Dynamics with Machine Learning," delves into our data-driven approach that uses a variant of reservoir computing to accurately predict complex quantum many-body dynamics far into the future, circumventing the need for computing intermediate time steps that typically slow down classical simulations of such dynamics [2]. These findings not only demonstrate the capabilities of GPUs in advancing scientific research but also underscore the potential of these interdisciplinary approaches to research in AI, materials science, and quantum simulation.

      References:
      [1] J. Tangpanitanon et al., Explainable Natural Language Processing with Matrix Product States, New Journal of Physics, 24, 053032, 2022.
      [2] A. Sornsaeng et al., Quantum Next Generation Reservoir Computing: An Efficient Quantum Algorithm for Predicting Quantum Dynamics, https://doi.org/10.48550/arXiv.2308.14239

  4. Dr. Pawan Goyal, Associate Professor, IIT Kharagpur
    • Date: Mar 20, 2024, 18:00 hours
    • Title: Sanskrit and Computational Linguistics
    • Abstract

      The talk will focus on how to make Sanskrit manuscripts more accessible to end-users through natural language technologies. The morphological richness, compounding, free word order, and low-resource nature of Sanskrit pose significant challenges for developing deep learning solutions. We identify fundamental tasks that are crucial for developing robust NLP technology for Sanskrit: word segmentation, morphological parsing, dependency parsing, and syntactic linearisation. Next, we will present our framework using Energy Based Models for multiple structured prediction tasks in Sanskrit. Our framework expects a graph as input, where relevant linguistic information is encoded in the nodes, and the edges are then used to indicate the association between these nodes. Typically, the state-of-the-art models for morphosyntactic tasks in morphologically rich languages still rely on hand-crafted features for their performance. But here, we automate the learning of the feature function. The feature function so learnt, along with the search space we construct, encodes relevant linguistic information for the tasks we consider. This enables us to substantially reduce the training data requirements to as low as 10% of the data requirements for the neural state-of-the-art models. Finally, the talk will also discuss some recent works which make use of the latest advances in deep learning for Sanskrit NLP, as well as interesting future directions in the field of Sanskrit Computational Linguistics.

  5. Dr. Bapi Chatterjee, Assistant Professor, CSE, IIIT Delhi
    • Date: Mar 14, 2024, 09:30 hours
    • Title: Dynamics of auxiliary parameters in distributed machine learning
    • Abstract

      Distributed systems are at the center stage of training today's machine learning models. Such settings include shared-memory and message-passing asynchrony, compression of gradients, local training to reduce communication, and combinations thereof. With problem-specific assumptions such as non-convexity and non-smoothness in place, taming the convergence of iterates under such system-dependent inconsistencies becomes challenging. In this talk, we present several algorithms with various system- and problem-generated analytical assumptions. We discuss a general strategy for constructing these algorithms drawing from their convergence theory. We discuss constructing an auxiliary global parameter in every case. We show that convergence of the distributed machine learning training algorithm can be tracked via the dynamics of the constructed parameter.

  6. Dr. Nidhi Tiwari, Microsoft, India
    • Date: Jan 29, 2024, 13:30 hours
    • Title: Leveraging Open-Source Foundation models to develop intelligent applications (YouTube)
    • Abstract

      OpenAI ChatGPT and other foundation models have garnered widespread attention for their ability to respond effectively to a wide range of human questions, solve logical problems, provide reasoning for the solutions, generate images, and so on. We all want to explore their capabilities and utilize them for developing intelligent features and products. However, we are constrained and delayed by their high cost, limited training data, limited access, and large size. The increasing number and variety of open-source foundation models offer a good alternative. In this session, we will look at some of the open-source LLMs. We will touch upon a few ways to access, finetune, and use them for projects. We will also look at some options that enable integration of LLMs in mobile applications.

  7. Dr. Amit Chintamani Awekar, Associate Professor, IIT Guwahati
    • Date: Jan 25, 2024, 09:30 hours
    • Title: Addressing the data bottleneck in information extraction
    • Abstract

      Supervised Machine Learning tasks require annotated data for model training. Annotating large-scale data is both costly and error-prone. The annotation error issue becomes even more complex when the number of annotation labels is of the order of hundreds or thousands. As a result, absence of high-quality data becomes the real bottleneck in improving the model performance. In this talk, we will consider three scenarios for addressing the data bottleneck.

      1. Data annotations are noisy. However, we cannot afford to re-annotate the whole dataset. How do we re-annotate only a part of the data?
      2. The annotation labels fail to capture the fine semantics of data. How do we create new annotation labels that are appropriate for our task?
      3. None of the existing datasets are appropriate for our particular application. How do we create new datasets from scratch or merge multiple existing datasets?
      We will discuss these three scenarios in the context of a specific task of Information Extraction. It is the task of extracting structured information from unstructured natural language text.

  8. Prof. Animesh Mukherjee, Professor, IIT Kharagpur
    • Date: Jan 10, 2024, 08:30 hours
    • Title: Vulnerabilities of LLMs in hate speech detection (YouTube)
    • Abstract

      Recently, efforts have been made by social media platforms as well as researchers to detect hateful or toxic language using large language models. However, none of these works aim to use explanation, additional context and victim community information in the detection process. We utilise different prompt variations and input information, and evaluate large language models in a zero-shot setting (without adding any in-context examples). We select two large language models (GPT-3.5 and text-davinci) and three datasets: HateXplain, implicit hate and ToxicSpans. We find that on average including the target information in the pipeline improves the model performance substantially (∼20-30%) over the baseline across the datasets. There is also a considerable effect of adding the rationales/explanations into the pipeline (∼10-20%) over the baseline across the datasets. In addition, we provide a typology of the error cases where these large language models fail to (i) classify and (ii) explain the reason for the decisions they take. Such vulnerable points automatically constitute ‘jailbreak’ prompts for these models, and industry-scale safeguard techniques need to be developed to make the models robust against such prompts.


CS Katha Barta Past years