Schedules

Event Schedule

New talks are added every week. Please check this page again soon.

All timings in IST

  • Day 1

    August 13, 2020

  • Deploying and managing models in production is often the most difficult part of the machine learning process. Enter model servers, which make it easy to load models and which offer production-critical features like logging, monitoring, and security. In this tech talk, learn more about the collaboration between AWS and Facebook to offer TorchServe, a PyTorch model-serving library that makes it easy to deploy trained models at scale without writing custom code.
    Tech Talks

  • More details at: https://cvdc.adasci.org/workshop/
    Workshop

  • Extracting IDs from lakhs of scanned physical documents is a challenging and time-consuming activity, and the solutions available in the market can be costly. In this session we will walk through the approach taken to extract IDs (photo, PAN card, address proof) from scanned copies of physical account-opening forms. The session covers the computer vision approach taken to solve this, automation of ID extraction, deployment of the solution, and learnings from our implementation. The second part of the session covers masking of Aadhaar card numbers. We will also cover technical details of the libraries, packages, model-building steps, and techniques used.
    Tech Talks

  • Deep learning on medical imaging is a popular, fruitful, and critical field, with many SOTA research items being released every day. But this research doesn't translate easily or quickly into practical or production-ready tech. To explain the challenges faced by applied machine learners, and how they can overcome them in real-world environments, we will take an industry-based deep dive into the problems faced by medical imaging researchers like me. We will cover the challenges with medical imaging data, including building good training datasets and the wrong practices that lead to leakage, such as the recent target leakage in a famous COVID X-ray prediction paper that I investigated. We will talk about sampling biases and strategies for medical image datasets, the industry standards for medical imaging and the challenges in those standards, and how even SOTA pretrained model weights fail on medical imaging datasets. We will discuss why traditional feature engineering and image augmentation never work on medical images, the data augmentation strategies used for different image types like X-rays, CT scans, and microscopic biopsy images, and how to mimic real-world imaging in datasets built under perfect lab conditions. We will touch upon some of the best architectures for deep learning on medical images and why they work extremely well. We conclude the discussion with what is available, what is lacking, and what is worth solving as soon as possible in the field of deep learning on medical imaging.
    Tech Talks

  • ML/DL projects involve an iterative and recursive R&D process of data gathering, data annotation, research, QA, deployment, additional data gathering from deployed units, and back again. The strong coupling between data and model means various teams, with various backgrounds and capabilities, need to work together very tightly to do their jobs effectively. Without a unifying R&D management tool, ongoing research operations in organizations usually suffer in the long run: reduced collaboration, lost work, irreproducible training, and a negative effect on the overall effectiveness of the company. As such, companies must use R&D infrastructure tailored for AI projects that supports the R&D workflow from research to production and enables them to adapt their offering to evolving demand. In this talk we share our experience from numerous deep learning projects and describe the features such infrastructure requires in order to boost productivity and be adaptive to the different R&D stages, from alpha version to a massively deployed product. Specifically, we will focus on these main subjects: data management, experiment management, resource management, ML pipelines, and AutoML.
    Tech Talks

  • During the COVID-19 lockdown period, the bulk of the population was forced to stay indoors in close proximity for prolonged periods of time. There were reports of family feuds and violence happening at a much higher frequency than in the pre-COVID-19 period. This is a cause of major concern to many social workers. Younger members of the population generally take to social media to air their frustrations and pain, which provides a channel for social workers to reach out. However, social workers can only screen a limited amount of information on social media. In this presentation, we will discuss how computer vision, combined with a knowledge graph, can be used to augment social workers' ability to identify youth at risk.
    Tech Talks

  • The healthcare industry has already seen many benefits from the rise of artificial intelligence (AI) solutions. One of the emerging AI fields today is computer vision, which can potentially support many different applications delivering life-saving functionality for patients. Computer vision today assists an increasing number of doctors in better diagnosing their patients, monitoring the evolution of diseases, and prescribing the right treatments. The technology helps medical professionals save time on routine tasks and give more time to their patients. The implications of computer vision for medical use, based on tasks such as medical imaging analysis, predictive analysis, or healthcare monitoring, suggest a host of benefits to the healthcare industry. The emerging field of computer vision focuses on training computers to replicate human sight and the understanding of objects in front of it. To accomplish that, computer vision takes advantage of artificial intelligence algorithms that process images. The goal of computer vision in healthcare is to make a faster and more accurate diagnosis than a physician could make. Currently, the most widespread use cases for computer vision in healthcare are in the field of radiology and imaging. AI-powered solutions are finding increasing support among doctors because of their ability to diagnose diseases and conditions from various scans such as X-ray, MR, or CT. Physicians who take advantage of computer vision technology will be able to analyze a wide range of health and fitness metrics to make better medical decisions. Today, such tools are used by healthcare centers to measure blood loss during surgery, e.g. during C-section procedures. Moreover, the technology can also be used to measure the body fat percentage of people using images taken with regular cameras. Computer vision offers a wide range of applications in health monitoring, bringing doctors closer to their patients, helping them save time on analyzing images, and providing them with more accurate data to work with. Code: https://github.com/akshitpriyesh/COVID-ENCOUNTER
    Tech Talks

  • The task addressed is that of counting repetitions of almost all types of exercises from a live video source. The solution uses PoseNet, a TensorFlow model, to extract human pose estimation data from each frame; successive frames are run through the proposed algorithm to get the live count of repetitions (reps) done up to that frame. The solution runs completely offline, has proved robust enough to handle real-world videos, and does not require pre-training except for the pose estimation part.
    Tech Talks
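The core of such a rep counter, once per-frame pose keypoints are available, can be sketched as threshold-crossing counting on a scalar keypoint signal. This is a hypothetical simplification for illustration, not the speaker's published algorithm:

```python
def count_reps(signal, threshold):
    """Count repetitions by counting crossings of a mid-point threshold.

    `signal` is a per-frame scalar derived from pose keypoints, e.g. the
    vertical position of the wrist during a bicep curl. One rep is one
    full up-down cycle, i.e. two threshold crossings.
    """
    crossings = 0
    above = signal[0] > threshold
    for value in signal[1:]:
        if (value > threshold) != above:   # the signal crossed the threshold
            crossings += 1
            above = value > threshold
    return crossings // 2                  # two crossings per repetition
```

A real system would smooth the keypoint signal first; raw pose estimates jitter enough to produce spurious crossings near the threshold.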

  • The examination of breast cancer by investigating histopathological images plays a serious role in the patient's development, and deep learning techniques are used to extract a set of parameters from images for building deep convolutional networks. We adapted the ShuffleNet model from pre-trained models for the multi-class classification of histopathological images of breast cancer using the transfer learning technique. All our results show that transfer learning provides the finest examination of breast cancer images. The average accuracy across all classes of breast tumor cells is about 95% using MATLAB. In this paper, we have proposed our methods of using pretrained networks in the transfer learning process with MATLAB on histopathological images of breast cancer. All results shown in this paper are on the augmented dataset in multi-class classification. Future researchers could use different networks with other training options and different parameters.
    Tech Talks

  • How an end-to-end AI platform can help your business by providing a unified AI strategy.
    Tech Talks

  • The open data GDELT Project has analyzed more than half a billion news images and a decade of television news through Google’s Vision & Video APIs, updated each day. From visual search to visual misinformation research to planetary-scale visual analysis, summary airtime files to raw frame-level annotations, what does it look like to analyze the visual landscape of the news through the eyes of today’s cloud AI and how does one transform the resulting tens of terabytes of JSON annotations into actionable insights?
    Tech Talks

  • Embedded spatial AI gives small, real-time systems the capability to parallel human-level perception in application-specific ways. This does not mean we've made a human. Rather, specific perception and interaction tasks that could previously only be solved by a person - such as quickly telling a good onion from a bad onion on a conveyor belt and throwing the bad ones back into the field - can now be performed by embedded systems. We are releasing the world's first embedded spatial AI solution which allows this. Some ask: is this about replacing humans? No, it's largely not. In fact, in almost all of the applications we've seen of this nascent tech, humans cannot be used because of size constraints (fitting a human into a 2x2" cube is hard!), thermal constraints, or quantification constraints (taking detailed, quantitative notes, like how ripe the strawberries are, and taking real-time action, etc.). And even in cases where humans can be used (e.g. the good-onion/bad-onion case), there is a huge labor shortage, as it's hot, dusty, miserable work - so currently the problem simply isn't solved. Every industry will be impacted.
    Tech Talks

  • Day 2

    August 14, 2020

  • Wearing masks is an important measure to slow down the spread of the coronavirus. In this talk, Vladimir will discuss different approaches to detecting people with or without masks in a large crowd.
    Tech Talks

  • The effect of the COVID-19 pandemic is widespread across the globe, and social distancing is a highly effective preventive measure to minimize virus spread. In this paper, we propose a computer vision-based system that can be used to automatically monitor social distancing compliance for single/multiple surveillance cameras in different environment settings such as offices, warehouses, etc. Our system implementation follows a two-step process: (1) human detection and (2) a check for social distancing. For human detection, we use a Faster R-CNN object detector, whereas for social distancing compliance we calculate the Euclidean distance between humans in the bird's-eye-view plane. Considering the nature of the problem, we use a high-recall object detector to minimize undetected humans and balance its natural corollary - i.e. low precision - with a wrapper that filters out false positives (non-human objects). Our overall precision is improved further by doing camera calibration and filtering short non-compliance events (t < 2 s). Finally, we discuss different deployment options for the tool to be effective.
    Tech Talks
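The social-distancing check in step (2) reduces to a pairwise Euclidean distance test once detections have been projected into the bird's-eye-view plane. A minimal sketch, assuming the calibration, homography, and detector already exist (names and the 2-metre threshold are illustrative):

```python
import math

def social_distance_violations(centroids, min_dist=2.0):
    """Return index pairs of people closer than `min_dist` metres.

    `centroids` are (x, y) positions of detected humans already projected
    into the bird's-eye-view plane (e.g. via cv2.perspectiveTransform
    with a calibrated homography).
    """
    pairs = []
    for i in range(len(centroids)):
        for j in range(i + 1, len(centroids)):
            (x1, y1), (x2, y2) = centroids[i], centroids[j]
            if math.hypot(x2 - x1, y2 - y1) < min_dist:
                pairs.append((i, j))
    return pairs
```

The O(n²) pairwise loop is fine for the handful of people visible to one camera; the expensive parts of the real system are the detector and the calibration, not this check.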

  • Race detection from face images using deep learning techniques is an active research area. It helps expand growing areas like human-computer interfaces and understanding user demographics, and provides great insight into the demographics and diversity of a population. Understanding ethnic diversity among a user base can help many commercial applications improve and optimize their products and services to better suit community needs. Development around race detection is already an active area of research, and improving its performance and speed is one focus. In this study we used a FaceNet-architecture-based feature extraction technique to detect race and compared it with plain CNN-based classification techniques. Comparative results support the claim that the race detection problem is better handled by such an embedding-based approach than by a plain image classification approach. Embedding-based techniques also provide a competitive edge over the other methods used in this comparative study.
    Paper Presentations

  • The digital world is moving from 2D data to 3D data for highly accurate information along with realistic user experiences. The applications include frequent airborne surveys that collect LIDAR data to detect vegetation growth and terrain changes due to weather or construction, for optimal planning and maintenance of power distribution lines or rail networks; building highly accurate autonomous vehicles by analyzing 3D point cloud data to detect objects on the pathways; and measuring the dimensional changes of large machinery in the mining industry for highly efficient operations. This talk discusses the recent advances in 3D data processing, feature extraction methods, object type detection, object segmentation, and object measurements across different body cross sections. It covers 3D imagery concepts, various algorithms for faster data processing in a GPU environment, and the application of deep learning techniques for object detection and segmentation. The practical implementation of 3D data analysis with open-source technologies such as Open3D, PointNet++, and other tools, the development of a data science solution framework, a detailed analysis of advanced AI/ML algorithms, and the technical challenges of different business problems will be showcased in detail.
    Tech Talks

  • The project is titled "Detection of Flood-Damaged Areas". Its purpose is to detect areas that are and aren't affected by floods from satellite images, with the help of several Convolutional Neural Network (CNN) architectures. We used DenseNet 121 without any feature extraction and attained a validation accuracy of 94.6% on the last epoch and a testing accuracy of 97.4%; further, we used ORB, which gives a test accuracy of 95.1%, and edge detection, which provides a test accuracy of 93.4%. All these feature-extraction variants were evaluated on the same model.
    Paper Presentations

  • This presentation is about threats to current computer vision models and how we can make AI more resilient to attack, considering real-world scenarios where AI is used in our daily lives to process images. My presentation considers the motivations, feasibility, and risks posed by adversarial inputs/perturbations. It provides intuitive explanations of the topic and explores how intelligent systems based on computer vision can be made more robust against adversarial input. AI not only competes with human capabilities in areas such as image, audio, and text processing, but often exceeds human accuracy and speed. While we celebrate advancements in AI, deep neural networks (DNNs) - the algorithms intrinsic to much of AI - have recently been shown to be at risk from attack through seemingly benign inputs. It is possible to fool DNNs by making subtle alterations to input data that often either remain undetected or are overlooked if presented to a human. For example, alterations to images that are so small as to remain unnoticed by humans can cause DNNs to misinterpret the image content. As many AI systems take their input from external sources - voice recognition devices or social media uploads, for example - this ability to be tricked by adversarial input opens a new, often intriguing, security threat.
    Tech Talks
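One classic construction of such adversarial perturbations - not necessarily the one covered in the talk - is the Fast Gradient Sign Method (FGSM), which nudges each input value by a small epsilon in the direction of the loss gradient's sign. A minimal framework-free sketch on a flat list of pixels, where the gradients are assumed to come from the attacked model:

```python
def fgsm_perturb(pixels, gradients, epsilon=0.01):
    """FGSM: shift each pixel by epsilon in the direction that increases
    the model's loss, then clip back to the valid [0, 1] range.

    `gradients` is the loss gradient w.r.t. each pixel, obtained by
    backpropagating through the target model.
    """
    sign = lambda g: (g > 0) - (g < 0)          # -1, 0, or +1
    return [min(1.0, max(0.0, p + epsilon * sign(g)))
            for p, g in zip(pixels, gradients)]
```

The perturbation magnitude per pixel is bounded by epsilon, which is why the altered image can remain visually indistinguishable from the original while still flipping the model's prediction.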

  • In the Internet age, malware poses a serious and evolving threat to security, making the detection of malware a matter of utmost concern. Many research efforts have been conducted on intelligent malware detection by applying data mining and machine learning techniques. In this project we treated a portable executable file as an image and used image classification techniques to classify any given exe file as malware or benignware. We used different feature extraction techniques such as edge detection, ORB, Log-Gabor, and Gabor filters. We used a pre-trained DenseNet121 model and achieved a maximum accuracy of 94.04% using just an ORB filter.
    Paper Presentations

  • Many times, when we create a computer vision model, we find ourselves interacting with a black box, unaware of what feature extraction is happening at each layer. With explainable AI, it becomes easier to comprehend, to know when enough layers have been added, and to see what feature extraction has taken place at each layer. In this session, we will go through various libraries that make this possible.
    Tech Talks

  • With the rapid growth of e-commerce and the increasing application of artificial intelligence in the fashion and retail domains, demand for fashion image datasets has been felt in the market. In recent years, the fashion datasets used in the public domain have primarily been FashionAI[1], FashionGen[2], DeepFashion[3], and DeepFashion2[4]. Among these, DeepFashion2 is the most extensive dataset, with rich annotations and a large collection drawn partially from DeepFashion[4] and partially from online fashion retail stores. This dataset contains more than 491,000 images consisting of 801,000 clothing items divided into 13 categories. The annotations for each clothing item in the training and validation sets include bounding box points, landmark points, scale, occlusion, zoom-in, viewpoint, and category name. Through our analysis, we have highlighted various errors in the DeepFashion2[4] dataset. Up until 2019 only half of the dataset was released, containing a labelled set of only 191,000 images for training and 52,000 for validation. In the course of this analysis a random subset of data was evaluated: we manually checked 5,000 images, found that 20% of them have annotation errors, and classified the errors into different categories. We have trained an SSD-MobileNet and shown a gain in mAP (mean average precision) on the cleaned dataset compared to the original dataset.
    Paper Presentations

  • This talk discusses the numerous challenges associated with enabling autonomous driving perception using only cameras, and that too in highly unstructured environments, and how we developed real-time deep learning inference models to overcome such challenges.
    Tech Talks

  • Medical errors are a leading cause of mortality in the medical field and a substantial contributor to increased medical costs. Radiologists play an integral role in the interpretation and diagnosis of X-ray images, and diagnostic errors are bound to happen as they examine and interpret large numbers of X-rays. Generally, over 40% of diagnostic errors lead to increased medical costs, incorrect treatment, or even death. Such diagnostic errors, or "misses" of primary or critical findings, lead to incorrect diagnosis. One of the reasons could be a lack of X-ray clarity and visibility. Solutions to address such problems are to colorize the X-ray and convert the X-ray image from 2D to 3D, which would help the radiologist interpret X-rays more easily and in less time due to improved clarity, visibility, and an almost real-life-like image. In this paper we have used computer vision techniques for colorization of X-ray images and for converting 2D X-ray images to 3D. X-ray colorization is done using Generative Adversarial Networks (GANs), specifically Pix2Pix GANs. A GAN consists of a generator part which generates the image and a discriminator part which differentiates between the generated image and an original image to push the generator toward realistic outputs. This combination of generator and discriminator networks produces realistic, life-like images. We have implemented this network combination to colorize grayscale images so that radiologists can easily detect any abnormality in the X-ray with good precision. 2D-to-3D image reconstruction involves transforming a 2D X-ray into a 3D X-ray using computer vision: a depth map is generated for the 2D X-ray image, giving a depth axis, or third dimension, to the otherwise 2D image using the OpenCV library. Once the depth map is generated, it becomes easy to plot the 3D X-ray images and view them from different angles so that radiologists don't miss out on abnormalities.
    Paper Presentations

  • Leveraging emerging technology such as artificial intelligence, deep learning, autonomy, and unmanned aerial systems to create and deliver solutions that strengthen governance. Drone tech has immense power and is evolving quite rapidly; the day is not far when we actually see flying cars. Research on autonomous flights has long been going on. Besides, there are various use cases of drones attainable with the help of computer vision. One of them is detecting anomalous behavior in a crowd, which has really helped during the pandemic we're dealing with now. Using Bayesian loss for crowd count estimation with point supervision helps us generate a density map showing humans. Drones can be used to deliver items from one place to another. Using computer vision techniques we can fly a drone and detect cracks in a building; with a good camera mounted, we can even detect potholes on the road. Drone data can also come as a TIFF file: given a multispectral orthophoto of a particular region captured by a drone, we can calculate the NDVI value and compute the vegetation cover area in that region. Drones hold a huge, game-changing role in future air mobility.
    Tech Talks
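The NDVI computation mentioned above has a standard closed form: NDVI = (NIR - Red) / (NIR + Red), evaluated per pixel on the near-infrared and red bands of the orthophoto. A minimal sketch on flattened bands (the 0.3 vegetation threshold is an illustrative value, not a universal constant):

```python
def ndvi(nir, red, eps=1e-9):
    """Normalized Difference Vegetation Index for one pixel.

    Healthy vegetation reflects NIR strongly and absorbs red light,
    so NDVI approaches +1 over dense green cover. `eps` guards
    against division by zero on dark pixels.
    """
    return (nir - red) / (nir + red + eps)

def vegetation_fraction(nir_band, red_band, threshold=0.3):
    """Fraction of pixels whose NDVI exceeds `threshold` (green cover)."""
    values = [ndvi(n, r) for n, r in zip(nir_band, red_band)]
    return sum(1 for v in values if v > threshold) / len(values)
```

Multiplying the fraction by the ground area covered by the orthophoto gives the vegetation cover area the talk refers to.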

  • The COVID-19 pandemic forced governments across the world to impose lockdowns to prevent virus transmission. This resulted in the shutdown of all economic activity, and accordingly production at manufacturing plants across most sectors was halted. While there is an urgency to resume production, there is an even greater need to ensure the safety of the workforce at the plant site. Reports indicate that maintaining social distancing and wearing face masks while at work clearly reduces the risk of transmission. We decided to use computer vision on CCTV feeds to monitor worker activity and detect violations, which trigger real-time voice alerts on the shop floor. This paper describes an efficient and economical approach to using AI to create a safe environment in a manufacturing setup. We demonstrate our approach to building a robust social distancing measurement algorithm using a mix of modern-day deep learning and classic projective geometry techniques. We have deployed our solution at manufacturing plants across the Aditya Birla Group (ABG). We have also described our face mask detection approach, which provides high accuracy across a range of customized masks.
    Paper Presentations

  • Almost all of us have used CamScanner (or a similar app). It is a very effective app which allows users to scan documents from their mobile and share them as images. The biggest advantage of the application is that it 'cleans up' (denoising, rotation, sharpening, etc.) a camera-clicked image into a very refined output. But did you know that computer vision is the science behind it, and that we can create our own CamScanner using the basics of OpenCV in Python? But first, what is OpenCV? The Open Source Computer Vision Library is an open-source computer vision library built to provide a common infrastructure for computer vision applications. It was initially written in C++ but also works with other languages such as Python. Learning objectives - upon completion of this workshop, the audience will be able to: use OpenCV to understand different transformation functions (blurring, thresholding, Canny edge detection, etc.); create a functional CamScanner; and, as a bonus, use Pytesseract (OCR) to extract text from images (if present). Prerequisites - you should have Python development experience; basics of OpenCV are preferred but not mandatory.
    Tech Talks
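One small but essential step of such a scanner pipeline sits between contour detection and the perspective warp: the four detected document corners must be put into a consistent order before being handed to OpenCV's `cv2.getPerspectiveTransform`. A plain-Python sketch of that ordering step (the workshop's actual code may differ):

```python
def order_corners(pts):
    """Order four document corner points as
    [top-left, top-right, bottom-right, bottom-left].

    In image coordinates (y grows downward), the top-left corner has
    the smallest x+y sum and the bottom-right the largest; the
    top-right has the smallest y-x difference and the bottom-left
    the largest.
    """
    by_sum = sorted(pts, key=lambda p: p[0] + p[1])
    by_diff = sorted(pts, key=lambda p: p[1] - p[0])
    return [by_sum[0], by_diff[0], by_sum[-1], by_diff[-1]]
```

With the corners ordered, the warp itself is two OpenCV calls: `cv2.getPerspectiveTransform` to compute the 3x3 matrix and `cv2.warpPerspective` to produce the flattened, top-down document image.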

  • In this paper, we describe the approach we used to build an end-to-end system for text and image queries on image datasets, which we call TIQS (Text & Image Query System). The system retrieves relevant images from a given annotated dataset based on a sample image or a text query. We have used images from two datasets: 1. the Visual Genome dataset from Stanford [1][2] and 2. the ADE20K dataset from CSAIL MIT [3][4][5]. The Visual Genome dataset is fully annotated and can be directly used for text queries. To respond to image-based queries, the images are segmented and annotated (into 150 categories) using PSPNet[8]. When a sample image is presented to retrieve similar images, the sample image too is segmented and annotated using PSPNet. A novel similarity score is defined between the sample image and images in the database, and the images with high similarity scores are presented to the user.
    Paper Presentations
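The paper's novel similarity score is not reproduced here. As a hypothetical stand-in, a Jaccard index over the annotated category labels of two segmented images illustrates the general shape of such a score:

```python
def category_similarity(labels_a, labels_b):
    """Jaccard index over the sets of segmentation category labels
    (e.g. the PSPNet categories present in each image).

    Returns 1.0 for identical label sets, 0.0 for disjoint ones.
    This is an illustrative stand-in, not the paper's score.
    """
    a, b = set(labels_a), set(labels_b)
    return len(a & b) / len(a | b) if (a | b) else 0.0
```

A production retrieval system would typically weight categories by pixel coverage rather than treating presence as binary, but the set-overlap idea is the same.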

  • Abstract—Solar data in the form of images and Carrington maps are very important resources for the study of the long-term variations of the sun. These data can help us study solar activity features such as filaments and other prominences. Solar filaments have long been related to Coronal Mass Ejections (CMEs). CMEs are major solar eruptions that can cause changes in the solar atmosphere and bring about geomagnetic storms on Earth. S-shaped filaments are popularly termed sigmoidal filaments. These structures may soon become unstable and give out large-scale CMEs. Thus the identification of sigmoidal filaments in a given solar image may help us better study the relation between them and CMEs. Work has been done previously on the identification of filaments from solar images and Carrington maps. This paper proposes a fully automated algorithm to detect sigmoidal filaments from Carrington maps, without any set parametric constants, that may work in real time. This may further play an important role in the prediction of CMEs/flares.
    Paper Presentations

  • Machine learning was already expanding its footprint in our daily lives. The recent global pandemic has forced us to pause and rethink, and now we are in urgent need to press the pedal of innovation in every walk of life. Arguably, at the forefront of this endeavor has to be our ability to expand the horizons of how we think about data. Never before in history has ML been as important as today, and never before has the ask been of such unprecedented scale. Be it the battle for living room entertainment, mitigating risks in business and supply chain models, accessing real-time data interventions, expediting data-intensive medical research, or even distancing ourselves from fellow beings, it'll be the eye-catcher, data science, that we'll look up to to serve a quick recipe. Narrowing the purview to computer vision, this talk, while focusing on examples from the consumer video entertainment sector, will also delve into everything that's right there in the playbook of video intelligence and its far-reaching empowerment capabilities. Using a concise framework, we will analyze the world of ML computer vision models and talk about the wise, the ugly, and the extreme ones, their ability to tackle business challenges, and use case economics.
    Tech Talks

  • Deep neural networks are among the most powerful machine learning techniques and are becoming very interesting for Big Data applications. In the context of embedded platforms, implementing DNNs that are efficient in terms of performance and energy consumption while maintaining the required quality is very challenging. Sparsity can be used as an intelligent technique for reducing the size of DNNs. The purpose of this research is to explore the possibilities of introducing sparsity to CNNs and to evaluate their performance.
    Tech Talks
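The simplest way to introduce the kind of sparsity described above is magnitude pruning: zeroing the smallest-magnitude fraction of a layer's weights on the assumption that they contribute least to the output. A framework-free sketch on a flat weight list (a real implementation would prune per-layer tensors and fine-tune afterwards to recover accuracy):

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude `sparsity` fraction of weights.

    `sparsity` is in [0, 1]; e.g. 0.5 drops half the weights. The
    surviving weights keep their original values, so the pruned layer
    computes an approximation of the original one with fewer nonzeros.
    """
    k = int(len(weights) * sparsity)                  # how many to drop
    keep = set(sorted(range(len(weights)),
                      key=lambda i: abs(weights[i]))[k:])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]
```

On embedded targets, the saving comes from storing and multiplying only the nonzero entries, which is why sparsity translates into both smaller models and lower energy per inference.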

  • Within Deloitte Consulting we have built a community of data scientists and ML engineers, and formalized it as an internal R&D arm. These teams cover the gamut of ML/AI projects, including core ML, reinforcement learning, NLP, complex optimization, and the one we will highlight: Edge and Computer Vision, which also includes sensor modeling, smart factories, and robotics. Within this R&D group, we will talk about a select handful of projects, including drone-based visual equipment inspection, activity recognition/pose estimation for clinical use cases, emotion detection for user experiences, visual question answering for the visually impaired, our CV-driven signal processing research, and CV-adjacent work like reinforcement learning for autonomous vehicles. We will discuss how we selected the use cases in response to market demand, how we formulated complex and custom ML solutions, how we work with business leaders to package these for client pitches, and how our R&D efforts directly benefit our work with clients. We will also cover some technical details and practical use cases behind these projects and how we reached the solutions. We’ll also touch upon how we use these expert teams to expose new learners to world-class data scientists, creating an ecosystem for fast-paced AI R&D within a large enterprise.
    Tech Talks

Full Day Computer Vision Workshop

The first day of the conference will feature a full-day hands-on workshop track on computer vision. The workshop will provide an overview of computer vision as a broad topic so participants can get started. A certificate of participation will be provided to all attendees of the workshop.


Extraordinary Speakers

Meet top developers, innovators & researchers in the space of computer vision.

  • Early Bird Pass

    Available till 10th July
  • Access to all tracks & workshops
  • Access the recorded sessions later
  • Certificate of attendance provided
  • Access to online networking with attendees & speakers
  • Group discount available
  • $25
  • Regular Pass

    Available from 10th July to 31st July
  • Access to all tracks & workshops
  • Access the recorded sessions later
  • Certificate of attendance provided
  • Access to online networking with attendees & speakers
  • Group discount available
  • $50