Dr. Dan Feldman: Solving 'Big Data' Problems using Coresets


Main Researcher: Dr. Dan Feldman

Research area: Robotics and Big Data 

In a world where technological progress generates massive amounts of digital data, leaner, faster and more affordable solutions are needed to help sort and analyze information in real time. Dr. Dan Feldman and his research students are using coresets (data reduction algorithms) to support better business intelligence and optimize the performance of simple robots for improved customer service and increased cost savings.

‘Big Data’ describes the unprecedented volumes of data streaming from all aspects of our lives: posts on social networks, readings from sensors, digital pictures and videos, GPS signals from mobile phones, and other sources, estimated at 2,500,000,000,000,000,000 bytes (2.5 quintillion) a day! Commonly used software tools are incapable of processing all this information in real time.

New Invention:
Dr. Feldman, Assistant Professor at the Computer Science Department, is introducing a new approach to solving ‘big data’ problems plaguing the IT industry. “I hope to bridge the gap between mathematical theory and engineering applications,” explains Feldman.
“Coresets are a new paradigm that can help us produce more accurate results from bigger and more complex datasets faster than ever. Unlike compression techniques (like ZIP or MP4), coresets are a problem-dependent data reduction technique that allows us to solve the problem faster by an order of magnitude. With smaller datasets, running times are improved while only marginally compromising the original data.”
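To make the idea of problem-dependent data reduction concrete, here is a minimal sketch in Python. It is a standard textbook illustration and not the RBD Lab's own algorithms: for one specific problem (the sum of squared distances from all data points to a query point), the whole dataset can be replaced by three statistics, and the answer comes out exactly the same. The function names `summarize`, `cost_full` and `cost_from_summary` are hypothetical and chosen for this example only. Coresets for harder problems are typically small weighted subsets of the data that give approximate rather than exact answers.

```python
import random


def summarize(points):
    """Reduce a list of d-dimensional points to three statistics:
    the count, the per-coordinate sums, and the sum of squared norms."""
    n = len(points)
    d = len(points[0])
    coord_sums = [sum(p[i] for p in points) for i in range(d)]
    sq_norm_sum = sum(x * x for p in points for x in p)
    return n, coord_sums, sq_norm_sum


def cost_full(points, q):
    """Sum of squared distances from every point to q (reads all the data)."""
    return sum(sum((x - y) ** 2 for x, y in zip(p, q)) for p in points)


def cost_from_summary(summary, q):
    """The same cost computed from the summary alone, using
    sum ||p - q||^2 = sum ||p||^2 - 2 * q . (sum p) + n * ||q||^2."""
    n, coord_sums, sq_norm_sum = summary
    dot = sum(s * y for s, y in zip(coord_sums, q))
    return sq_norm_sum - 2 * dot + n * sum(y * y for y in q)


# Illustrative data only: 10,000 random 2D points (an assumption for the demo).
random.seed(0)
points = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(10000)]

summary = summarize(points)   # computed once, in a single pass over the data
query = (0.3, -0.7)           # any query point can now be answered from the summary

exact = cost_full(points, query)
reduced = cost_from_summary(summary, query)
print(exact, reduced)
```

The summary is built once and then answers any query in time that depends only on the dimension, not on the number of points. It is useless for a different problem, such as finding the nearest neighbor, which is what distinguishes this kind of reduction from general-purpose compression like ZIP.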
Feldman and his students are using coresets (as a statistical computation tool) to solve fundamental Big Data problems that affect machine learning performance, working through robotics projects at the Robotics and Big Data (RBD) Lab. One of these projects involves autonomous navigation of ordinary toy drones using low-cost (but very safe) ‘simple’ hardware combined with strong, novel algorithms.
Other projects include the development of gesture control armbands – the future of wearable technology and human-computer interaction that lets you control technology (like your phone, computer, and even industrial machines) hands-free using only gestures and motion.
The group has also recently begun to apply coresets to differential privacy, as a way of addressing security challenges in cloud computing. “The idea is to extract statistical data from large datasets while preserving the privacy and anonymity of its users – we refer to it as private coresets or a sanitized database,” relates Feldman.
According to Feldman, “The main breakthrough in this field is that we now have a general coreset framework for any number of processing challenges. My goal is to bring this to the attention of engineers, analysts and data scientists so that they may apply it to their research or in solving complex business problems.”

Potential uses:
At the Robotics and Big Data (RBD) Laboratory, a team of Computer Science students is developing inexpensive real-time tracking systems that will turn ordinary toy drones into autonomous drones, capable of navigating through complex grounds and buildings.
Soliman Nasser (PhD student) and his lab partner Ibrahim Jubran (MSc student) are developing state-of-the-art algorithms based on coresets that will be able to track and localize flying robots. With the help of research assistant Michael Volgin and BSc student George Kesaev, the team succeeded in creating a low-cost tracking system to ultimately replace commercial systems that are up to a hundred times more expensive.
The main challenge is stabilizing the drones using large amounts of data collected from cheap sensors such as 3D cameras, electroencephalography (EEG) devices and inertial measurement units (IMUs). This is achieved by calibrating the movement of the drone against a carefully chosen set of points or markers (such as visual features or even stars in the sky) collected by sensors within a virtual world. Automating the selection of these ‘smart’ set points is a key challenge. Once this application runs smoothly it can be adapted to indoor or outdoor environments. The team is also developing a low-cost docking surface so that the drone can recharge easily and swiftly, as its flight time is quite short at 7-8 minutes.

Dr. Dan Feldman is a leading expert in the field of scalable data reduction. He arrived at the University in 2014 to set up the Robotics and Big Data (RBD) Laboratory, after completing a post-doctoral fellowship at MIT and Caltech. The Lab serves a large number of undergraduate and graduate students. The RBD Lab also attracts international students from leading universities to conduct their research under the guidance of Dr. Feldman.


Dr. Dan Feldman and his research team at the Robotics and Big Data Lab. Jan 2017