Open Data Studio¶
Open data studio is a fully managed computing service on the Staroid cloud, built with an open source development model.
That means you can enjoy all the benefits of software as a service without giving up the ability to understand the code, contribute to it, and improve it, as with any other open source software.
From a Python shell, IDE, or notebook on your laptop, interactively process massive data in your data lake with serverless Spark.
Connect your BI tools via JDBC using the Spark Thrift server. An on-demand Spark cluster is automatically configured for you.
Visualize your data in an interactive notebook using Apache Zeppelin. An on-demand Spark cluster is automatically configured for you.
Use all the latest machine learning technology in a single place. Open data studio continues to integrate the best technologies for machine learning.
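For the JDBC connection mentioned above, a BI tool typically needs a connection URL. The Spark Thrift server speaks the HiveServer2 protocol, so the URL uses the `jdbc:hive2://` scheme. The following is a minimal sketch of building such a URL. The helper name `thrift_jdbc_url` and the host `spark-thrift.example.com` are hypothetical placeholders, and port 10000 is the Thrift server's stock default; the endpoint and port that Open data studio actually exposes are not stated here and may differ.

```python
from urllib.parse import quote


def thrift_jdbc_url(host, port=10000, database="default"):
    """Build a HiveServer2-style JDBC URL for a Spark Thrift server.

    Hypothetical helper: host, port, and database are assumptions,
    not values taken from Open data studio.
    """
    if not host:
        raise ValueError("host must be non-empty")
    # Percent-encode the database name so unusual characters stay valid in the URL.
    return f"jdbc:hive2://{host}:{int(port)}/{quote(database, safe='')}"


# Placeholder host; replace with the endpoint of your own Thrift server.
print(thrift_jdbc_url("spark-thrift.example.com"))
# prints "jdbc:hive2://spark-thrift.example.com:10000/default"
```

The resulting string is what you would paste into the JDBC connection settings of a BI tool, together with a Hive JDBC driver.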
Ease of use¶
Access to the latest machine learning technology shouldn’t be more than a few clicks or a few lines of code away.
    # import open data studio library
    import ods

    # create a spark cluster on the cloud with 3 initial workers
    spark = ods.spark("my-spark", worker_num=3).session()

    # run spark task
    df = spark.read.load("...")

    $ # install ray and staroid package
    $ pip install ray staroid kubernetes
    $ # switch to nightly build
    $ ray install-nightly
    $ # get autoscaler yaml files
    $ git clone https://github.com/ray-project/ray.git
    $ # spin-up cluster on the cloud and attach
    $ ray up ray/python/ray/autoscaler/staroid/example-full.yaml
    $ ray attach ray/python/ray/autoscaler/staroid/example-full.yaml
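Once a session exists, a common next step is to peek at a few rows. The sketch below assumes that the object returned by `.session()` behaves like a standard `pyspark.sql.SparkSession`; this is not confirmed here. The helper name `preview` is hypothetical, and it relies only on the PySpark calls `read.load`, `limit`, and `collect`.

```python
def preview(spark, path, limit=5):
    """Load a dataset and return at most `limit` rows as a list.

    Assumption: `spark` behaves like a pyspark.sql.SparkSession.
    """
    # Same reader call as in the snippet above; the default format is Parquet.
    df = spark.read.load(path)
    # limit() keeps the amount of data sent back to the driver small.
    return df.limit(limit).collect()


# Usage, with `spark` created as in the snippet above:
# rows = preview(spark, "...")
```

Limiting before `collect()` matters with a remote cluster, since `collect()` alone would pull the entire dataset back to your laptop.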
Save time and reduce risk. Open data studio is maintained by committers of the open source projects and by industry experts, on top of Staroid, a secure, reliable, and high-performance cloud platform.
Open data studio is an open source project. You can easily read the source code, understand how it works, and get involved. When you need to, fork it and get your own version of the managed service!
Also, every time you launch a project, the developers of that project get funded via StarRank.