Making Spark Accessible: My Databricks Summer Internship
databricks
SEPTEMBER 26, 2023
My summer internship on the PySpark team was a whirlwind of exciting events. The PySpark team develops the Python APIs of the open.
The Pragmatic Engineer
OCTOBER 18, 2024
And yet, substitute Apple with Automattic, App Store with WordPress.org and Spotify with one of the most popular WordPress plugins: and Automattic’s CEO is accused of orchestrating events similar to above. This event is shameful and unprecedented in the history of open source on the web. Open source theft? Source: X What next?
Snowflake
MAY 16, 2024
The event-driven architecture converts events to Snowflake’s relational tables, enabling rapid, accurate, and secure data delivery for the most crucial government programs — ultimately benefiting more people, more smoothly. The work we do together is truly meaningful. ”
Data Engineering Podcast
FEBRUARY 18, 2024
Join in with the event for the global data community, Data Council Austin. Don't miss out on their only event this year! What are the differences in terms of pipeline design/access and usage patterns when using a Trino/Iceberg lakehouse as compared to other popular warehouse/lakehouse structures?
Monte Carlo
NOVEMBER 12, 2024
While not every company needs to process millions of events per second, understanding these advanced architectures helps us make better decisions about our own data infrastructure, whether we’re handling user recommendations, ride-sharing logistics, or simply figuring out which meeting rooms are actually being used.
The Pragmatic Engineer
OCTOBER 10, 2024
Wordpress.org – which has a complex, intertwined setup with Automattic, and was also cofounded by Matt Mullenweg – bans WP Engine from accessing its plugin repository and updates infrastructure. 25 Sep: Block. This penalizes WP Engine customers. 27 Sep: Unblock. Wordpress.org
Snowflake
OCTOBER 30, 2023
We are excited to announce the general availability of Snowflake Event Tables for logging and tracing, an essential feature to boost application observability and supportability for Snowflake developers. Developers can then query Event Tables to troubleshoot their applications or gain insights into performance and code behavior.
The Pragmatic Engineer
NOVEMBER 21, 2023
The rise of the internet and web-based computing (1989) ARPANET's influence expanded in the 1980s when the National Science Foundation Network (NSFNet) provided access to a network of supercomputers across the US. Third-party events like “ Oprah’s book club ” could send traffic and orders spiking 1000s of percent.
Data Engineering Podcast
DECEMBER 24, 2023
Summary: Kafka has become a ubiquitous technology, offering a simple method for coordinating events and data across different systems. Operating it at scale, however, is notoriously challenging. In the event of these different cluster errors, what are the strategies for mitigating and recovering from those failures?
The Pragmatic Engineer
JANUARY 12, 2023
Pollen was an events tech startup founded in 2015, which raised more than $200M in funding and employed about 600 people by 2022. It defied gravity by appearing to thrive at the same time as the Covid-19 pandemic shut down swathes of the events industry, worldwide. To get this newsletter every week, subscribe here.
The Pragmatic Engineer
APRIL 11, 2024
Confirming this suspicion is Fabrick’s status page that says Fabrick’s tech staff are involved: “We would like to inform you that the event is currently still ongoing, continuous monitoring is in place, and any slowdowns are affecting only the services provided jointly with the Sella group.
The Pragmatic Engineer
MARCH 13, 2023
In case you somehow missed it: we went through the fastest bank run in history, in an event that impacted about half of all VC-funded startups in the US and UK. In this special edition, I cover the events of this blow to the tech economy. For some startups, losing access to their bank account prompted drastic action.
Data Engineering Podcast
NOVEMBER 26, 2023
Summary: Building a data platform that is enjoyable and accessible for all of its end users is a substantial challenge. Developing event-driven pipelines is going to be a lot easier - Meet Functions!
The Pragmatic Engineer
APRIL 9, 2024
On top of the ability to post jobs, they had access to The Pragmatic Engineer Talent Collective. As I analyzed this event in The Pragmatic Engineer: Every hiring manager I talked with told me how Stack Overflow Jobs has been the place where job ads got good candidates.
Workfall
SEPTEMBER 26, 2023
Reading Time: 9 minutes In this blog, we will cover: What are Server-Sent Events? Why Stream Data Using Server-Sent Events (SSE)? Hands-On Conclusion What are Server-Sent Events? These events can carry information like notifications, updates, or alerts. This makes it accessible to a wider range of developers and projects.
Data Engineering Podcast
APRIL 16, 2023
As the capabilities of these systems have improved and become more accessible, the target of what self-serve means changes. Their SDKs make event streaming from any app or website easy, and their extensive library of integrations enables you to automatically send data to hundreds of downstream tools.
Data Engineering Podcast
MARCH 3, 2024
Join us at the top event for the global data community, Data Council Austin. Don't miss out on our only event this year!
Data Engineering Podcast
FEBRUARY 25, 2024
Join us at the top event for the global data community, Data Council Austin. Don't miss out on our only event this year! Promo Code: dataengpod20
The Pragmatic Engineer
MAY 25, 2023
Organize a “Glassdoor review event,” asking employees to leave honest reviews. Engineering leaders and engineers shared some stories: A startup organized a social event at work, and asked employees to show up on time with their laptops. However, there are companies which employ more questionable tactics. Exclusive A 1.5
Data Engineering Podcast
MARCH 18, 2023
Announcements: Hello and welcome to the Data Engineering Podcast, the show about modern data management. Join in with the event for the global data community, Data Council Austin. Don't miss out on their only event this year!
Snowflake
JANUARY 29, 2024
This new connector offers immediate access to your Google Analytics data without the hassle, complexity and expense of manual integration via API endpoints and patched systems workarounds. Once your data reaches its final destination, access to the ingested data is governed by Snowflake’s built-in access controls.
Netflix Tech
NOVEMBER 12, 2024
By: Rajiv Shringi , Oleksii Tkachuk , Kartik Sathyanarayanan Introduction In our previous blog post, we introduced Netflix’s TimeSeries Abstraction , a distributed service designed to store and query large volumes of temporal event data with low millisecond latencies. Today, we’re excited to present the Distributed Counter Abstraction.
The Pragmatic Engineer
OCTOBER 17, 2024
The startup was able to start operations thanks to getting access to an EU grant called NGI Search grant. Read the full issue here , and read this week’s The Pulse — all of them analyzing relevant events happening in the tech industry. This was one section from last week’s The Pulse.
Snowflake
JUNE 5, 2024
We couldn’t make our advancements in AI, data accessibility, monetization and more without their ongoing support and partnership, and this recognition could not be more well-deserved. The SPN has experienced incredible growth and is crucial to our most important value, putting the customer first.
Snowflake
JUNE 5, 2024
At the same time, organizations must ensure the right people have access to the right content, while also protecting sensitive and/or Personally Identifiable Information (PII) and fulfilling a growing list of regulatory requirements.
Robinhood
MARCH 6, 2024
The news cycle, world events, and market moving events like earnings often happen outside of US East Coast business hours. Here’s how to access the Robinhood 24 Hour Market in-app: Select your stock or ETF. 24 Hour Market lets customers invest when they want, on their schedule.
Data Engineering Podcast
MARCH 17, 2024
Join us at the top event for the global data community, Data Council Austin. Don't miss out on our only event this year!
Data Engineering Podcast
MARCH 10, 2024
Join us at the top event for the global data community, Data Council Austin. Don't miss out on our only event this year! Promo Code: dataengpod20
The Pragmatic Engineer
OCTOBER 31, 2023
Diagnosis: Customers may be unable to access Cloud resources in europe-west9-a Workaround: Customers can fail over to other zones.” We are still working on processing the backlog of asynchronous Lambda invocations that accumulated during the event, including invocations from other AWS services (such as SQS and EventBridge).
Knowledge Hut
MARCH 22, 2024
These locks are designed to automatically unlock during emergencies or power outages, ensuring swift evacuation or access for first responders. These sophisticated locks operate on the principle of maintaining secure access control, even in the event of power failure or system malfunction.
Engineering at Meta
OCTOBER 22, 2024
Traditionally, WhatsApp has lacked the ability to store your contact list in a way that can be easily and automatically restored in the event you lose it. Contact names are stored encrypted within WhatsApp, and we’ve built this with additional, robust protections by using IPLS to deter access to contacts to anyone except the user.
Netflix Tech
SEPTEMBER 10, 2024
During this event, we generate a timestamp and store it in an eBPF hash map using the process ID as the key. This feature enables access to a wealth of information stored about a process. To safely access this RCU-protected information, we can leverage kfuncs in eBPF.
Robinhood
OCTOBER 21, 2024
Historically, UK retail investors have had challenges accessing features like margin investing that provide other ways to help maximise their investments. At Robinhood we understand that investors want access to expand and diversify their portfolios at industry leading rates, in an amazing user experience.”
Data Engineering Podcast
MARCH 24, 2024
Trusted by the teams at Comcast and Doordash, Starburst delivers the adaptability and flexibility a lakehouse ecosystem promises, while providing a single point of access for your data and all your data governance allowing you to discover, transform, govern, and secure all in one place.
Cloudera
MAY 10, 2024
Building an open data lakehouse with Iceberg delivers significant benefits increasing self-service access, ease of use, flexibility, and delivering unified security and governance for all data. From data pipelines into Iceberg to data governance, the event will hit on a broad range of topics surrounding Iceberg.
Cloudera
MARCH 6, 2024
As we enter into a new month, the Cloudera team is getting ready to head off to the Gartner Data & Analytics Summit in Orlando, Florida for one of the most important events of the year for Chief Data Analytics Officers (CDAOs) and the field of data and analytics.
Snowflake
NOVEMBER 15, 2023
Register now to get access to sessions on AI and ML, Snowpark, Iceberg, streaming, Snowflake Native Apps and more. Anyone registered for BUILD will have access to the bootcamp—but there is limited capacity. More Meetups are being added all the time—we just listed an event in Pune, India—so check out [link] to find a city near you.
Netflix Tech
JUNE 1, 2023
New content or national events may drive brief spikes, but, by and large, traffic is usually smoothly increasing or decreasing. It also included metadata about ads, such as ad placement and impression-tracking events. We stored these responses in a Keystone stream with outputs for Kafka and Elasticsearch.
Robinhood
FEBRUARY 27, 2024
Money Drills will kick off at each university this week with events and speakers sharing more on how the program will come to life on campus. “We’re proud to announce our work with Florida State University, Coastal Carolina University, and University of California, Berkeley to bring access to financial education to even more students.”
Zalando Engineering
NOVEMBER 8, 2023
Postgres Logical Replication at Zalando Builders at Zalando have access to a low-code solution that allows them to declare event streams that source from Postgres databases. At the time of writing, there are hundreds of these Postgres-sourced event streams out in the wild at Zalando. Simple, right?
Data Engineering Podcast
JANUARY 28, 2024
Summary Monitoring and auditing IT systems for security events requires the ability to quickly analyze massive volumes of unstructured log data. The majority of products that are available either require too much effort to structure the logs, or aren't fast enough for interactive use cases.
Knowledge Hut
MAY 22, 2024
Confidentiality: Confidentiality in information security assures that information is accessible only by authorized individuals. Simply put, it’s about controlling access to data to block unauthorized disclosure. Conversely, an adequate system also assures that those who need access have the required privileges.
Knowledge Hut
MAY 3, 2024
Kafka offers better fault tolerance because of its event-driven processing. Processing type: Kafka analyses events as they take place. Stream processing is highly beneficial if the events you wish to track happen frequently and close together in time. The result is a continuous processing model.
Netflix Tech
OCTOBER 8, 2024
Rajiv Shringi Vinay Chella Kaidan Fullerton Oleksii Tkachuk Joey Lynch Introduction As Netflix continues to expand and diversify into various sectors like Video on Demand and Gaming , the ability to ingest and store vast amounts of temporal data — often reaching petabytes — with millisecond access latency has become increasingly vital.