Compatible & Open
Seamlessly migrates offline Spark applications to the cloud based on the open-source Apache Spark ecosystem and APIs, reducing your migration workload.
Powerful Computing
Adopts a highly scalable big data architecture to process data at the TB to EB scale, allowing you to handle data analysis requests in various scenarios with ease.
Uses the in-memory computing model, DAG scheduling framework, and an efficient optimizer to deliver overall performance 100 times that of the traditional MapReduce model.
Bills you based on usage time. The pricing unit of DLI is the compute unit (CU). A CU contains four cores and 16 GB of memory. DLI bills you USD 0.228 per CU per hour.
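As a rough illustration of the billing arithmetic above, the following Python sketch multiplies the number of CUs by the usage time and the stated hourly rate. The function names and the sample job size are hypothetical, and the sketch assumes a simple linear charge with no rounding, minimum charge, or discount, none of which are specified here.

```python
from decimal import Decimal
from typing import Tuple

# Figures stated in the text above.
PRICE_PER_CU_HOUR = Decimal("0.228")  # USD per CU per hour
CORES_PER_CU = 4
MEMORY_GB_PER_CU = 16


def resources(cus: int) -> Tuple[int, int]:
    """Return (cores, memory in GB) provided by `cus` compute units."""
    return cus * CORES_PER_CU, cus * MEMORY_GB_PER_CU


def estimate_cost(cus: int, hours: Decimal) -> Decimal:
    """Return the estimated charge in USD, assuming purely linear billing."""
    if cus < 0 or hours < 0:
        raise ValueError("cus and hours must be non-negative")
    return PRICE_PER_CU_HOUR * cus * hours


if __name__ == "__main__":
    # Hypothetical job: 16 CUs used for 2.5 hours.
    print(resources(16))                      # (64, 256)
    print(estimate_cost(16, Decimal("2.5")))  # 9.1200
```

Decimal arithmetic is used instead of floats so that the currency amounts stay exact.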
You can connect to DLI using JDBC or an SDK, and DLI complies with ANSI SQL 2003. With DLI, you can analyze massive volumes of data without having to deploy, operate, or maintain SQL engines.
DLI offers full-stack Spark capabilities, such as Spark SQL, Spark Streaming, and Spark Batch based on the Apache Spark ecosystem, and helps you analyze data at the TB to EB scale with standard SQL or Spark APIs.
Computing resources are isolated between tenants to meet job SLAs. Data permissions can be restricted to a specific table or column, which supports data sharing between departments and permission management.
DLI integrates the capabilities of processing and analyzing images, videos, and languages in SQL to offer convergent analysis for structured and unstructured data.
DLI can work with multiple data formats, such as CSV, JSON, Parquet, ORC, and CarbonData, and supports federated analysis on data from various cloud services (for example, OBS, DWS, CloudTable, and RDS) without data migration, helping you quickly deliver business innovations and get valuable insights from data.
Auto scaling of storage and computing resources allows you to query data without worrying about whether you have sufficient resources.
Game Operation Data Analysis
Different departments of a game company analyze newly generated daily logs on the game data analysis platform to obtain the metrics they need and make decisions based on them. For example, the operation department obtains metrics such as new players, active players, retention rate, churn rate, and payment rate to learn the current game status and determine follow-up actions. The placement department obtains the channel sources of new players and active players to determine the platforms for placement in the next cycle.
DLI uses Spark Streaming to ingest data directly from DIS and perform preprocessing such as data cleaning. You only need to write the processing logic and do not need to manage the multi-threading model.
You can use standard SQL statements to write metric analysis logic without dealing with the complexity of the distributed computing platform.
Log analysis is scheduled periodically based on time requirements, with long idle periods between consecutive runs. DLI uses pay-per-use billing and charges only for the resources used during scheduled runs, which reduces costs by more than 50% compared with an exclusive cluster.
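The metrics named in this scenario can be expressed in plain SQL. The sketch below uses Python's built-in sqlite3 module as a local stand-in for a SQL engine; it does not use DLI itself. The table layout, sample rows, and metric definitions (a new player is one whose first login falls on the given day, retention is the share of the previous day's new players who log in again, churn is one minus retention, and payment rate is paying players divided by active players) are assumptions made for illustration.

```python
import sqlite3


def build_sample_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical login and payment logs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE logins (player_id TEXT, login_date TEXT);
        CREATE TABLE payments (player_id TEXT, pay_date TEXT, amount REAL);
        INSERT INTO logins VALUES
            ('p1', '2024-01-01'), ('p2', '2024-01-01'),
            ('p3', '2024-01-01'), ('p4', '2024-01-01'),
            ('p1', '2024-01-02'), ('p2', '2024-01-02'), ('p5', '2024-01-02');
        INSERT INTO payments VALUES ('p1', '2024-01-02', 4.99);
        """
    )
    return conn


# Subquery that finds each player's first login date.
FIRST_LOGIN = """
    SELECT player_id, MIN(login_date) AS first_date
    FROM logins GROUP BY player_id
"""


def daily_metrics(conn: sqlite3.Connection, day: str, prev_day: str) -> dict:
    """Compute the operation metrics for `day` under the assumed definitions."""
    active = conn.execute(
        "SELECT COUNT(DISTINCT player_id) FROM logins WHERE login_date = ?",
        (day,),
    ).fetchone()[0]
    new = conn.execute(
        f"SELECT COUNT(*) FROM ({FIRST_LOGIN}) AS f WHERE f.first_date = ?",
        (day,),
    ).fetchone()[0]
    # Cohort: players who were new on prev_day; retained: those seen again on day.
    cohort, retained = conn.execute(
        f"""
        SELECT COUNT(DISTINCT f.player_id), COUNT(DISTINCT l.player_id)
        FROM ({FIRST_LOGIN}) AS f
        LEFT JOIN logins AS l
          ON l.player_id = f.player_id AND l.login_date = ?
        WHERE f.first_date = ?
        """,
        (day, prev_day),
    ).fetchone()
    paying = conn.execute(
        "SELECT COUNT(DISTINCT player_id) FROM payments WHERE pay_date = ?",
        (day,),
    ).fetchone()[0]
    retention = retained / cohort if cohort else 0.0
    return {
        "new_players": new,
        "active_players": active,
        "retention_rate": retention,
        "churn_rate": 1.0 - retention,
        "payment_rate": paying / active if active else 0.0,
    }


if __name__ == "__main__":
    print(daily_metrics(build_sample_db(), "2024-01-02", "2024-01-01"))
```

With the sample rows, one new player and three active players are counted for 2024-01-02, and two of the four players who first appeared on 2024-01-01 return, giving a retention rate of 0.5.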
Digital Service Transformation for Car Company
In the face of new competitive pressures and changes in travel services, car companies build an IoV cloud platform and IVI OS to connect Internet applications with vehicle use scenarios and complete their digital service transformation. This delivers a better travel experience for vehicle owners, increases the competitiveness of car companies, and promotes sales growth. For example, a car company can collect and analyze daily vehicle metric data (such as batteries, engines, tire pressure, and airbags) and promptly give maintenance suggestions to vehicle owners.
RDS stores the basic information about vehicles and vehicle owners, CloudTable stores real-time vehicle location and health status information, and DWS stores periodic metric statistics. DLI allows federated analysis on data from multiple sources without data migration.
Car companies need to retain all historical data to support auditing and other services that require infrequent data access. Warm and cold data are stored in OBS, and frequently accessed data is stored in CloudTable and DWS, reducing the overall storage cost.
There are no special requirements for CPU, memory, hard disk space, or bandwidth.
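The idea of federated analysis in this scenario, joining vehicle data held in RDS, CloudTable, and DWS in one query without copying it first, can be sketched locally. The example below uses SQLite's ATTACH DATABASE through Python's sqlite3 module, with three separate in-memory databases standing in for the three services. It is an analogy only and does not show DLI's actual syntax for cross-source tables. All table names, columns, sample rows, and the tire pressure threshold are hypothetical.

```python
import sqlite3


def build_sources() -> sqlite3.Connection:
    """Attach three independent in-memory databases, one per simulated service."""
    conn = sqlite3.connect(":memory:")
    for name in ("rds", "cloudtable", "dws"):
        # Each ':memory:' attachment is its own separate database.
        conn.execute(f"ATTACH DATABASE ':memory:' AS {name}")
    conn.executescript(
        """
        -- Basic vehicle and owner information (RDS in the scenario).
        CREATE TABLE rds.vehicles (vin TEXT PRIMARY KEY, owner TEXT);
        -- Real-time health status (CloudTable in the scenario).
        CREATE TABLE cloudtable.status (vin TEXT, tire_pressure_kpa REAL);
        -- Periodic metric statistics (DWS in the scenario).
        CREATE TABLE dws.daily_stats (vin TEXT, avg_battery_pct REAL);

        INSERT INTO rds.vehicles VALUES ('VIN001', 'Alice'), ('VIN002', 'Bob');
        INSERT INTO cloudtable.status VALUES ('VIN001', 180.0), ('VIN002', 230.0);
        INSERT INTO dws.daily_stats VALUES ('VIN001', 82.0), ('VIN002', 64.0);
        """
    )
    return conn


def vehicles_needing_service(conn: sqlite3.Connection, min_pressure_kpa: float):
    """Join all three sources in one query and return vehicles below the threshold."""
    return conn.execute(
        """
        SELECT v.owner, v.vin, s.tire_pressure_kpa, d.avg_battery_pct
        FROM rds.vehicles AS v
        JOIN cloudtable.status AS s ON s.vin = v.vin
        JOIN dws.daily_stats AS d ON d.vin = v.vin
        WHERE s.tire_pressure_kpa < ?
        ORDER BY v.vin
        """,
        (min_pressure_kpa,),
    ).fetchall()


if __name__ == "__main__":
    # Hypothetical threshold of 200 kPa for a maintenance suggestion.
    print(vehicles_needing_service(build_sources(), 200.0))
    # [('Alice', 'VIN001', 180.0, 82.0)]
```

The point of the sketch is that the query addresses each source by name and joins them in place, which is the behavior the scenario attributes to federated analysis.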
Geographic Big Data Analysis
Geographic big data shares the typical characteristics of big data: large data volumes (for example, global satellite remote sensing generates image data at the PB scale) and a wide variety of data types (for example, structured remote sensing image raster data, vector data, unstructured spatial location data, and 3D modeling data). Users focus on how to use efficient mining tools or methods to get insights from these large volumes of geographic data.
DLI supports full-stack Spark capabilities and provides rich Spark spatial data analysis algorithm operators. It fully supports offline batch processing on massive volumes of data, such as structured remote sensing image data, unstructured 3D modeling data, and laser point cloud data, as well as real-time computing on dynamic streaming data with location attributes.
DLI allows you to quickly migrate remote sensing image data at the TB to EB scale and slice the image data into resilient distributed datasets (RDDs) for distributed batch computing.
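To make the image slicing step more concrete, the following Python sketch cuts a raster into fixed-size tiles that could each be processed independently. It is a generic tiling example under stated assumptions, not DLI's slicing implementation: the function names, the tile size, and the use of a nested list as the raster are all hypothetical.

```python
from typing import Iterator, List, Tuple

# (row_start, row_end, col_start, col_end), with end indices exclusive.
Tile = Tuple[int, int, int, int]


def tile_bounds(height: int, width: int, tile_size: int) -> Iterator[Tile]:
    """Yield the bounds of every tile covering a height x width raster."""
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    for row in range(0, height, tile_size):
        for col in range(0, width, tile_size):
            # Edge tiles are clipped to the raster boundary.
            yield (row, min(row + tile_size, height),
                   col, min(col + tile_size, width))


def slice_raster(raster: List[List[int]],
                 tile_size: int) -> List[Tuple[Tile, List[List[int]]]]:
    """Return (bounds, pixel block) pairs, one per tile."""
    height = len(raster)
    width = len(raster[0]) if raster else 0
    return [
        ((r0, r1, c0, c1), [line[c0:c1] for line in raster[r0:r1]])
        for r0, r1, c0, c1 in tile_bounds(height, width, tile_size)
    ]


if __name__ == "__main__":
    # A tiny 4 x 6 raster whose pixel values are just running numbers.
    raster = [[r * 6 + c for c in range(6)] for r in range(4)]
    for bounds, block in slice_raster(raster, 4):
        print(bounds, block)
```

In a Spark job, a list of tiles like this could be handed to `SparkContext.parallelize` to obtain an RDD with one element per tile; that step is omitted here so the sketch runs without a Spark installation.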
Support of querying CloudTable data
Commercial release of the service, renamed DLI
Support of fully managed Spark jobs
Interactive multi-job editor
Support of gene sequencing jobs
Support of geospatial query
SQL support of complex data types
Support of federated analysis of heterogeneous data sources
DLI is compatible with Spark 2.3.2
Heterogeneous data source CSS is supported in federated analysis
Heterogeneous data source GeoMesa will be supported in federated analysis