Enterprise AI company DataRobot today announced the latest release of the DataRobot Enterprise AI Platform, which adds automated feature engineering and debuts AI Catalog, a service that lets enterprise teams search customer data in order to build and deploy AI models or make predictions.
AI Catalog uses technology from Cursor, a company DataRobot acquired in February for an undisclosed amount. Cursor was born out of experiences at LinkedIn, where Cursor cofounder and DataRobot VP of product management Adam Weinstein led an analytics team. The challenge there was helping data scientists and engineers find data scattered across a large organization, then helping them understand how to make that data searchable and sharable to enable team collaboration.
AI Catalog will seek to achieve the same goals.
The update will also include automated feature engineering, which creates features that enhance RPA by drawing on related or secondary datasets. Automatic feature generation will grow more powerful as AI Catalog grows, Weinstein told VentureBeat.
“We actually don’t want you to even have to do that in the long run, I think there’s sort of this like chicken and egg problem of once users start using the catalog, and the data is populated there, we can actually look to that catalog, automatically identify those data sets, and do the whole thing without any user assistance,” he said.
Automated feature engineering can also help enforce governance within organizations by implementing common standards and definitions, for example by ensuring a single shared definition of customer churn.
Data scientists will also be able to use Apache Spark SQL within the Enterprise AI Platform to combine multiple datasets from Hadoop, disparate text, or other sources in AI Catalog.
“I can actually combine all those within DataRobot without leaving the platform using Spark SQL, and then we’ll emit a new dataset, that transformed dataset, that you can then use for projects to create new models or create predictions,” Weinstein said.
The platform update also includes MLOps, a service introduced last month that combines existing DataRobot services for AI with tools from machine learning operations company ParallelM, which DataRobot acquired in June. The service operates with Apache Spark and Kubernetes and comes with tools designed to help organizations deploy models in production, such as a dashboard for automatically identifying systems that should be retrained to improve performance.
Despite heavy investment in AI talent and insistence by some that we now live in an AI world, many businesses still struggle to deploy AI in production. According to a November 2018 PricewaterhouseCoopers survey, 4% of business executives reported challenges deploying AI systems.
Earlier this month, DataRobot raised a $206 million funding round, bringing its total funding to more than $400 million.