Spark Summit
  • 60 videos
  • 3,786,955 views

Videos

Women in Big Data Lunch at Spark Summit East
1.7K views, 5 years ago
Panel Discussion with Ziya Ma, Nick Dimtchev, Julie Greenway and Gunjan Sharma moderated by Donna Fernandez
The Leaky Pipeline Problem: Making your Mark as a Woman in Big Data: by Kavitha Mariappan
2.3K views, 5 years ago
Women in Big Data Keynote at Spark Summit East
Apache Spark Meet Up at Spark Summit East 2017
4.1K views, 5 years ago
Using Spark and Riak for IoT Apps-Patterns and Anti Patterns: Spark Summit East talk by Pavel Hardak
2.1K views, 5 years ago
Everybody agrees that IoT is changing the world… and creating new challenges for software developers, architects and DevOps. How can we build efficient and highly scalable distributed applications using open-source technologies? What are the characteristics of data generated by IoT devices, and how does it differ from traditional enterprise or Big Data problems? Which architectural patterns are beneficia...
Effective Spark with Alluxio: Spark Summit East talk by Gene Pang and Haoyuan Li
3.1K views, 5 years ago
Alluxio, formerly Tachyon, is a memory-speed virtual distributed storage system that leverages memory for storing data and accelerating access to data in different storage systems. Alluxio has a quickly growing open source community of developers and users and is deployed at such organizations as Alibaba, Baidu, Barclays, Intel, Huawei, and Qunar. Many of these deployments use Alluxio with Spar...
Utilizing Spark as the Analytical Core to an Open Source HTAP Relational Database: John Leach
1.5K views, 5 years ago
Utilizing Spark as the Analytical Core to an Open Source HTAP Relational Database on HBase. Splice Machine utilizes Spark on YARN as the analytical execution architecture for our open source HTAP relational database. This talk will walk through how a dual engine architecture can exist where Spark supports analytical queries and large database maintenance operations (HBase Compactions, Index Main...
Building Real Time BI Systems with Kafka, Spark & Kudu: Spark Summit East talk by Ruhollah Farchtchi
13K views, 5 years ago
One of the key challenges in working with real-time and streaming data is that the data format for capturing data is not necessarily the optimal format for ad hoc analytic queries. For example, Avro is a convenient and popular serialization service that is great for initially bringing data into HDFS. Avro has native integration with Flume and other tools that make it a good choice for landing d...
The Fast Path to Building Operational Applications with Spark: talk by Nikita Shamgunov
1.9K views, 5 years ago
Kerberizing Spark: Spark Summit East talk by Abel Rincon and Jorge Lopez-Malla
1.1K views, 5 years ago
Spark has been chosen, deservedly, as the main massively parallel processing framework, and HDFS is one of the most popular Big Data storage technologies. Their combination is therefore one of the most common Big Data use cases. But what about security? Can these two technologies coexist in a secure environment? Furthermore, with the proliferation of BI technologies adapted to Big...
Optimizing Spark Deployments for Containers: Isolation, Safety & Performance by William Benton
3.5K views, 5 years ago
Developers love Linux containers, which neatly package up an application and its dependencies and are easy to create and share. However, this unbeatable developer experience hides some deployment challenges for real applications: how do you wire together pieces of a multi-container application? Where do you store your persistent data if your containers are ephemeral? Do containers really contai...
Auto Scaling Systems With Elastic Spark Streaming: Spark Summit East talk by PhuDuc Nguyen
2.3K views, 5 years ago
Come explore a feature we’ve created that is not supported out-of-the-box: the ability to add or remove nodes to always-on real time Spark Streaming jobs. Elastic Spark Streaming jobs can automatically adjust to the demands of traffic or volume. Using a set of configurable utility classes, these jobs scale down when lulls are detected and scale up when load is too high. We process multiple TB’s...
Secured Kerberos based Spark Notebook for Data Science: Spark Summit East talk by Joy Chakraborty
3.2K views, 5 years ago
This presentation will provide technical design and development insights for setting up a Kerberized (secured) JupyterHub notebook using Spark. Joy will show how Bloomberg set up the Kerberos-based Spark notebook, integrating JupyterHub, Sparkmagic, and Livy. Sparkmagic provides the Spark kernel for Scala and Python. Livy is one of the most promising open source software projects for submitting ...
Apache Toree: A Jupyter Kernel for Spark: Spark Summit East talk by Marius van Niekerk
4.5K views, 5 years ago
Teaching Apache Spark Clusters to Manage Their Workers Elastically: Erik Erlandson and Trevor Mckay
1.1K views, 5 years ago
DevOps engineers have applied a great deal of creativity and energy to invent tools that automate infrastructure management, in the service of deploying capable and functional applications. For data-driven applications running on Apache Spark, the details of instantiating and managing the backing Spark cluster can be a distraction from focusing on the application logic. In the spirit of DevOps,...
Building a Dataset Search Engine with Spark & Elasticsearch: talk by Oscar Castañeda-Villagrán
11K views, 5 years ago
Building Realtime Data Pipelines with Kafka Connect & Spark Streaming by Ewen Cheslack-Postava
12K views, 5 years ago
Apache Carbondata: An Indexed Columnar File Format for Interactive Query by Jacky Li/Jihong Ma
3.8K views, 5 years ago
Lessons Learned from Dockerizing Spark Workloads: Spark Summit East talk by Tom Phelan
3.8K views, 5 years ago
Time Series Analytics with Spark: Spark Summit East talk by Simon Ouellette
6K views, 5 years ago
New Directions in pySpark for Time Series Analysis: Spark Summit East talk by David Palaitis
6K views, 5 years ago
Keeping Spark on Track: Productionizing Spark for ETL: talk by Kyle Pistor and Miklos Christine
9K views, 5 years ago
Spark: Data Science as a Service: Spark Summit East talk by Shekhar Agrawal and Sridhar Alla
2.1K views, 5 years ago
Accelerating Spark Genome Sequencing in Cloud-A Data Driven Approach by Eric Kaczmarek and Lucy Lu
449 views, 5 years ago
Spark Autotuning: Spark Summit East talk by Lawrence Spracklen
2.6K views, 5 years ago
Fault Tolerance in Spark: Lessons Learned from Production: Spark Summit East talk by Jose Soltren
4.2K views, 5 years ago
Lambda Processing for Near Real Time Search Indexing at WalmartLabs: talk by Snehal Nagmote
3.9K views, 5 years ago
Sparking Up Data Engineering: Spark Summit East talk by Rohan Sharma
1.6K views, 5 years ago
Spark Streaming as a Service with Kafka and YARN: Spark Summit East talk by Jim Dowling
2.5K views, 5 years ago
Learnings Using Spark Streaming and DataFrames for Walmart Search: by Nirmal Sharma and Yan Zheng
2.4K views, 5 years ago
Advanced Apache Spark Training - Sameer Farooqui (Databricks)
309K views, 7 years ago
A Deeper Understanding of Spark Internals - Aaron Davidson (Databricks)
132K views, 7 years ago
First Steps With Spark - Spark Screencast #1
94K views, 8 years ago
Get Rid of Traditional ETL, Move to Spark! (Bas Geerdink)
93K views, 5 years ago
Top 5 Mistakes When Writing Spark Applications
92K views, 6 years ago
Optimizing Apache Spark SQL Joins: Spark Summit East talk by Vida Ha
74K views, 5 years ago
TRAINING Intro to Apache Spark - Brian Clapper (Independent Consultant)
72K views, 6 years ago
Spark + Parquet In Depth: Spark Summit East talk by: Emily Curtin and Robbie Strickland
62K views, 5 years ago
Data Science Lifecycle with Zeppelin and Spark
58K views, 6 years ago
Structuring Apache Spark 2.0: SQL, DataFrames, Datasets And Streaming - by Michael Armbrust
58K views, 5 years ago
Building Realtime Data Pipelines with Kafka Connect and Spark Streaming
49K views, 6 years ago
Deep Dive: Apache Spark Memory Management
49K views, 5 years ago
Livy: A REST Web Service For Apache Spark
36K views, 5 years ago
Mastering Spark Unit Testing (Ted Malaska)
35K views, 5 years ago
Apache Spark 2.0: A Deep Dive Into Structured Streaming - by Tathagata Das
33K views, 5 years ago
Using Spark and Elasticsearch for Real-time Data Analysis- Costin Leau (Elasticsearch)
33K views, 7 years ago
Spark and Spark Streaming at Uber - Meetup talk with Tathagata Das
30K views, 6 years ago
Relationship Extraction from Unstructured Text Based on Stanford NLP with Spark
30K views, 6 years ago
Deep Dive into Monitoring Spark Applications Using Web UI and SparkListeners (Jacek Laskowski)
30K views, 5 years ago
Best Practices for running PySpark
29K views, 6 years ago
Spark on YARN: a Deep Dive - Sandy Ryza (Cloudera)
29K views, 7 years ago
R & Spark: How to Analyze Data Using RStudio's Sparklyr: by Nathan Stephens
28K views, 5 years ago
Top 5 Mistakes When Writing Spark Applications
26K views, 5 years ago
Spark Documentation Overview - Spark Screencast #2
26K views, 8 years ago
SparkSQL: A Compiler from Queries to RDDs: Spark Summit East talk by Sameer Agarwal
26K views, 5 years ago
Real-time big data processing with Spark Streaming- Tathagata Das (Databricks)
25K views, 8 years ago
Enabling Exploratory Analysis of Large Data with R and Spark
24K views, 6 years ago
Transformations and Caching - Spark Screencast #3
23K views, 8 years ago
A Standalone Job in Scala - Spark Screencast #4
22K views, 8 years ago
Lambda Architecture, Analytics and Data Pathways with Spark Streaming, Kafka, Akka and Cassandra
22K views, 6 years ago

Comments

  • Houston Vanhoy
    Houston Vanhoy 10 days ago

    This presentation is not about fraud prevention. As an audience member said, the title is a fraud. 👎

  • Murali Kadambi
    Murali Kadambi 22 days ago

    Audio level is very low.

  • Amit Bhattacharyya
    Amit Bhattacharyya a month ago

    Good explanations. It would be great if they could mention some Git code.

  • omkar
    omkar a month ago

    Debugged a problem by following the steps you explained. Thank you very much. (April 2022)

  • Tomracc
    Tomracc a month ago

    this is wonderful, enjoyed start to end :)

  • NAUSHAD AHAMAD
    NAUSHAD AHAMAD a month ago

    Great talk, sir. You have cleared up the topics.

  • svdfxd
    svdfxd a month ago

    One of the best videos to learn about Spark Structured Streaming. Watched this once, back at the end of 2020... still relevant.

  • Nainul ARAB S M
    Nainul ARAB S M 2 months ago

    Could you please demonstrate how to retain the decimal value when writing a DataFrame in JSON format? E.g., one of the columns in my df has a value of 12.00. When I write this df to a JSON file with df.write.json("user/my path/"), the JSON file written to this path will have "column":12.0 instead of 12.00.

  • justsurviving forbutterchicken

    Is there code available for this?

  • Hasan Al-Ammori
    Hasan Al-Ammori 3 months ago

    Fantastic talk! I wish there was a little more info on the format spec itself.

  • S R
    S R 3 months ago

    python api is shit as fuck.

  • Mick Dreeling
    Mick Dreeling 3 months ago

    This guy at the beginning. Find a seat man. Ruined it for me.

  • Alex Xela
    Alex Xela 3 months ago

    Great

  • Ramses Alexander Coraspe Valdez

    code?

  • theodopoulos
    theodopoulos 4 months ago

    Weird, my comment vanished. I noticed that pivot is now supported in SQL syntax. However, you need to define the list of columns to pivot within the IN as an expression_list. How would you define this list if it was dynamic?

  • SonyX64
    SonyX64 4 months ago

    Whenever this bit manipulation comes, I feel "this is not my thing" 😝

  • Jahid Hasan
    Jahid Hasan 4 months ago

    Does anyone have resources or source code for a deep-learning-based RLScheduler for single-node-level task scheduling?

  • Taylor Perkins
    Taylor Perkins 6 months ago

    Thank you so much! I have inherited a Scala Spark Streaming code base and have been in the process of learning all that it does. This code base does not have any actual unit testing, and this talk has given me the confidence and encouragement to write unit tests to make it stronger and to better understand what it does. Figuring out how to test Spark outside of a production environment has been a big challenge. This has definitely helped a lot.

  • Rajeev Das
    Rajeev Das 7 months ago

    You said that if you borrow from 20 it becomes 19, but I think the 2 becomes a 1, since there is a 0 in the middle.

  • Nihad TP
    Nihad TP 7 months ago

    1:26:26 Corona.

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    Pattern recognition?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    yes?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    Is this distortion just on my end?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    Naturally inerit means?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    Or it depends on the 3 V's?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    Is the current version outdated?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    Is this a network lag?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    Hello, I am not able to hear this very clearly?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    I mean distortion

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    Is this audible?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    What?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    Konnichiwa

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    Or it's just me?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    Is it a mic issue for everyone?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    Could you share the link of the book here?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    FACE PALM?! Was I that out of sync?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    Ummmmmm I WANNA free book T_T

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    umm..... Spark....?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    To be fair I've used

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    So is YouTube? The same as YouTube???

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    Demonized?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    I'm sorry, am I right?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    Depends on dependency?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    So did you get a date?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    .exe? Same as target??

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    One sec

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    2.+ ?

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    2.0.*

  • Megzie Ssyk
    Megzie Ssyk 7 months ago

    Redundancy?