ML Talk

Hosted by Subhankar Mishra's Lab
People -> Tata Satya Pratheek, Subhankar Mishra

2021

Talks so far

  1. Talk 19 - Sahel Mohammad Iqbal
    • Date: Tuesday, Nov 23rd, 2021| 3:00 PM IST
    • Title: Neural Networks at a Fraction with Pruned Quaternions.
    • Abstract: Contemporary state-of-the-art neural networks have increasingly large numbers of parameters, which prevents their deployment on devices with limited computational power. Pruning is one technique to remove unnecessary weights and reduce resource requirements for training and inference. In addition, for ML tasks where the input data is multi-dimensional, using higher-dimensional data embeddings such as complex numbers or quaternions has been shown to reduce the parameter count while maintaining accuracy. For computer vision applications, since the inputs are arrays of pixels that are each made up of 3 channels, quaternions provide a natural way to represent the pixels as single entities. In this talk, I will present the results for pruning experiments conducted on real and quaternion-valued implementations of different architectures on object detection tasks. The results indicate that for very high model sparsities, quaternion models provide higher accuracies than their real counterparts.
    • Reference Paper(s): NA
  2. Talk 18 - Tata Satya Pratheek
    • Date: Monday, Nov 8th, 2021| 3:00 PM IST
    • Title: Prior Image-Constrained Reconstruction using Style-Based Generative Models.
    • Abstract: In this talk, we look at a framework that uses a style-based generative model to obtain a useful estimate of an object, semantically related to another object's prior image, from highly incomplete imaging measurements. Obtaining useful information from incomplete imaging measurements of an object, or solving an ill-posed imaging inverse problem, remains a "holy grail" of imaging science. We look into the details in the course of the talk.
    • Reference Paper(s): Paper Link
  3. Talk 17 - Saswat Das
    • Date: Tuesday, Sep 21st, 2021| 3:30 PM IST
    • Title: An Introduction to (Privacy in Data Analysis and) Differential Privacy.
    • Abstract: Data analysis and model training often require working with datasets containing sensitive or incriminating information that could compromise a person's privacy, finances, or daily life if analysts were given access to the raw data. Data anonymisation and privacy preservation have therefore been a long-running line of research, although techniques such as anonymisation, summary statistics, and query auditing are known to be susceptible to various attacks (viz. linkage, reconstruction, differencing) and to problems with practical implementation. In this talk, we shall discuss what went wrong with previously used approaches, attempts at remedying these problems, and why differential privacy came to be. This is followed by a description of differential privacy and randomisation and, if time permits, a discussion of some of its primitives such as mechanisms and compositions of mechanisms.
    • Reference Paper(s): NA
  4. Talk 16 - Sahel Mohammad Iqbal
    • Date: Monday, Sep 20th, 2021| 3:30 PM IST
    • Title: Downsizing neural networks.
    • Abstract: It is a general trend that as neural networks get better at their respective jobs, the number of parameters they have also goes up, and with it the resources required for training and inference. As an example, current state-of-the-art face detection models have upwards of 100 million weights. In this talk we'll look at two methods to reduce the number of parameters of a neural network: pruning and using quaternions. I'll also cover the research questions that I look to explore in my M.Sc project, which is concerned with using the above two methods simultaneously to achieve an extremely slimmed-down (convolutional) neural network.
    • Reference Paper(s): NA
  5. Talk 15 - Nalin Kumar
    • Date: Monday, Sep 13th, 2021| 3:30 PM IST
    • Title: Simultaneous Machine Translation.
    • Abstract: Simultaneous Machine Translation (SiMT) involves translation of a continuous stream of data. In this talk, we will discuss one of the earliest works in SiMT using NMT models. The authors showed that, unlike previous methods, which translated such data in two separate processes (segmenting the continuous stream and translating the segments), NMT models can perform both tasks jointly.
    • Slides: slides
    • Reference Paper(s): NA
  6. Talk 14 - Jyothish Kumar J
    • Date: Tuesday, Sep 7th, 2021| 2:30 PM IST
    • Title: Introduction to Robotic Prosthetics.
    • Abstract: The world of accessible prosthetic limbs mostly consists of dummy legs and hands which are devoid of much function other than playing the cosmetic role of one’s structural completion. Robotic prosthetics are still an exclusive domain which is largely inaccessible to the majority due to its poor affordability. This talk is going to be an introduction to my MSc thesis project, which aims at studying and understanding the recent developments and available options in the world of robotic prosthetics. The talk features some of the challenges and initial impressions regarding the problems to be overcome in order to design a lifelike robotic prosthetic limb which is accessible to the masses.
    • Slides: slides
    • Reference Paper(s): NA
  7. Talk 13 - Deepak Kumar
    • Date: Monday, Aug 30th, 2021| 3:30 PM IST
    • Title:
    • Abstract:
    • Reference Paper(s): NA
  8. Talk 12 - Annada Prasad Behera
    • Date: Monday, Aug 23rd, 2021| 3:30 PM IST
    • Title:
    • Abstract:
    • Reference Paper(s): NA
  9. Talk 11 - Annada Prasad Behera
    • Date: Monday, Aug 16th, 2021| 3:00 PM IST
    • Title:
    • Abstract:
    • Reference Paper(s): NA
  10. Talk 10 - Nalin Kumar
    • Date: Tuesday, Aug 10th, 2021| 3:00 PM IST
    • Title: Input Combination Strategies for Multi-Source Transformer Decoder.
    • Abstract: In multi-source sequence-to-sequence tasks, the attention mechanism can be modeled in several ways. This topic has been thoroughly studied on recurrent architectures. In this paper, we extend the previous work to the encoder-decoder attention in the Transformer architecture. We propose four different input combination strategies for the encoder-decoder attention: serial, parallel, flat, and hierarchical. We evaluate our methods on tasks of multimodal translation and translation with multiple source languages. The experiments show that the models are able to use multiple sources and improve over single source baselines.
    • Slides: slides
    • Reference Paper(s): Paper link
  11. Talk 9 - Deepak Kumar
    • Date: Monday, Aug 9th, 2021| 3:00 PM IST
    • Title: Optimizing Transformer for Low-Resource Neural Machine Translation.
    • Abstract: Language pairs with limited amounts of parallel data, also known as low-resource languages, remain a challenge for neural machine translation. While the Transformer model has achieved significant improvements for many language pairs and has become the de facto mainstream architecture, its capability under low-resource conditions has not been fully investigated yet. Our experiments on different subsets of the IWSLT14 training data show that the effectiveness of Transformer under low-resource conditions is highly dependent on the hyper-parameter settings. Our experiments show that using an optimized Transformer for low-resource conditions improves the translation quality up to 7.3 BLEU points compared to using the Transformer default settings.
    • Reference Paper(s): Paper link
  12. Talk 8 - Deepak Kumar
    • Date: Wednesday, June 2nd, 2021| 12:00 PM IST
    • Title: MIME: MIMicking Emotions for Empathetic Response Generation.
    • Abstract: Current approaches to empathetic response generation view the set of emotions expressed in the input text as a flat structure, where all the emotions are treated uniformly. We argue that empathetic responses often mimic the emotion of the user to a varying degree, depending on its positivity or negativity and content. We show that the consideration of these polarity-based emotion clusters and emotional mimicry results in improved empathy and contextual relevance of the response as compared to the state-of-the-art. Also, we introduce stochasticity into the emotion mixture that yields emotionally more varied empathetic responses than the previous work. We demonstrate the importance of these factors to empathetic response generation using both automatic and human-based evaluations.
    • Reference Paper(s): Paper link
  13. Talk 7 - Annada Prasad Behera
    • Date: Wednesday, May 19th, 2021| 12:00 PM IST
    • Title: The math of differentiable rendering.
    • Abstract: In this weekly talk, I will discuss the rendering equation and the anti-aliasing equation, how the anti-aliasing equation is approximated in practice, and how such approximations break the differentiability of any renderer that tries to implement it. I will also discuss how we can mathematically preserve differentiability by using Feynman's differentiation under the integral sign and the Reynolds Transport Theorem.
    • Reference Paper(s): Paper Link
  14. Talk 6 - Nalin Kumar
    • Date: Saturday, Feb 27th, 2021| 7:00 PM IST
    • Title: Data Augmentation for NMT.
    • Abstract: Neural Machine Translation (NMT) models are highly dependent on the size of the dataset. However, it is not always viable to get a huge amount of parallel sentences for a given pair of languages. This is where data augmentation techniques come to the rescue. In this talk, we discuss various data augmentation techniques used for NMT. We first briefly discuss the formal definition and go through the history of these techniques in NMT. We then discuss back-translation, data diversification and cut-off, with their algorithms and results.
    • Slides: slides
    • Reference Paper(s): Paper 1, Paper 2, Paper 3
  15. Talk 5 - Annada Prasad Behera
    • Date: Monday, Feb 15th, 2021| 7:00 PM IST
    • Title: Synthesis of novel views by estimating radiance fields.
    • Abstract: In this talk, I'll briefly discuss (a) the rendering integral, (b) the radiance function, and finally (c) a method for synthesizing novel views of complex scenes by optimizing an underlying continuous volumetric scene function using a sparse set of input views.
    • Reference Paper(s): NA
  16. Talk 4 - Rucha Bhalchandra Joshi
    • Date: Monday, Feb 8th, 2021| 7:00 PM IST
    • Title: Handling Missing Data with Graph Representation Learning
    • Abstract: Machine learning with missing data has been approached in two different ways, including feature imputation where missing feature values are estimated based on observed values and label prediction where downstream labels are learned directly from incomplete data. However, existing imputation models tend to have strong prior assumptions and cannot learn from downstream tasks, while models targeting label prediction often involve heuristics and can encounter scalability issues. Here we propose GRAPE, a graph-based framework for feature imputation as well as label prediction. GRAPE tackles the missing data problem using a graph representation, where the observations and features are viewed as two types of nodes in a bipartite graph, and the observed feature values as edges. Under the GRAPE framework, the feature imputation is formulated as an edge-level prediction task and the label prediction as a node-level prediction task. These tasks are then solved with Graph Neural Networks. Experimental results on nine benchmark datasets show that GRAPE yields 20% lower mean absolute error for imputation tasks and 10% lower for label prediction tasks, compared with existing state-of-the-art methods.
    • Slides: slides
    • Reference Paper(s): Paper Link
  17. Talk 3 - Aman Upadhyay
    • Date: Friday, Jan 15th, 2021| 7:00 PM IST
    • Title: Why are the lottery tickets winning?
    • Abstract: Neural networks are highly over-parametrized, and pruning such networks is a way to get faster inference and lower storage requirements. The winning lottery ticket approach is one such algorithm: it prunes the network by finding the "important weights" in a network and retraining the network with these weights. We will briefly discuss the algorithm as proposed in the attached papers 1 and 2, and then try to find why it works (papers 3 and 4) and what its shortcomings are. Paper 4 also proposes an algorithm which tries to overcome these shortcomings.
    • Reference Paper(s): Paper 1, Paper 2, Paper 3, Paper 4
  18. Talk 2 - Jyotirmaya Shivottam
    • Date: Friday, Jan 8th, 2021| 7:00 PM IST
    • Title: Near Real-Time Incremental Learning for Object Detection at the Edge.
    • Abstract: One of the most interesting areas for application of object detection techniques is the Internet of Things (IoT) or, in a nutshell, "Edge Computing devices". Most current research is focussed on tuning and compressing networks so that they can be run with limited resources on such devices, but this usually takes hours of training time. As such, a crucial component for AI on the edge has to be real-time Incremental Learning (IL) of new object classes. The paper I will be presenting in this talk explores this avenue and combats the problem of Catastrophic Forgetting (CF) by introducing a novel one-stage deep object detection algorithm that incorporates some insights from the Learning without Forgetting (LwF) Knowledge Distillation technique. Additionally, the authors have designed an automated training dataset construction pipeline for the new object class that ensures that the inference capability of the edge device is near real-time.
    • Slides: slides
    • Reference Paper(s): [arXiv]
  19. Talk 1 - Danush Shekar
    • Date: Friday, Jan 1st, 2021| 7:00 PM IST
    • Title: Fooling automated surveillance cameras: adversarial patches to attack person detection.
    • Abstract: We have all seen machine learning algorithms gaining widespread attention in multiple industries. Surveillance and security is one such area in which machine learning has applications. Recent research papers have focussed on finding image patches which cause, say, a classifier to ignore a given object. The arXiv paper I will be presenting is one such example, wherein the authors present an approach to generate adversarial image patches that one can wear or hold to be hidden from a person-detection classifier.
    • Slides: slides
    • Reference Paper(s): [arXiv] Link